CANADIAN JOURNAL OF MATHEMATICS




Journal Canadien de Mathématiques 


VOL. VIII - NO. 4 
1956 


Spectral theory for a class of non-normal operators 
Harry Gonshor 


Groups of positive operators H. A. Dye and R. S. Phillips 
Hypersurfaces of a Finsler space Hanno Rund 


The zeros of solutions of second-order linear 
differential equations P. R. Beesack and Binyamin Schwarz 


On the extension of measure by the 
method of Borel L. LeBlanc and G. E. Fox 


Extremal properties of Hermitian matrices 
M. Marcus and J. L. McGregor 
Non-Desarguesian projective 
plane geometries N. S. Mendelsohn 
Double transitivity in finite projective planes T. G. Ostrom 
Resolvents of certain linear groups in a finite field L. Carlitz 


Prime power representations of finite linear groups 
Robert Steinberg 
On indefinite ternary quadratic forms 
B. W. Jones and G. L. Watson 


Published for 
THE CANADIAN MATHEMATICAL CONGRESS 


by the 


University of Toronto Press 





EDITORIAL BOARD 
H. S. M. Coxeter, A. Gauthier, R. D. James, R. L. Jeffery, 
G. de B. Robinson, H. Zassenhaus 
with the co-operation of 


H. Behnke, R. Brauer, D. B. DeLury, G. F. D. Duff, I. Halperin, 
W. K. Hayman, J. Leray, S. MacLane, P. Scherk, B. Segre, 
J. L. Synge, W. J. Webber 


The chief languages of the Journal are English and French. 


Manuscripts for publication in the Journal should be sent to the 
Editor-in-Chief, H. S. M. Coxeter, University of Toronto. Everything 
possible should be done to lighten the task of the reader; the notation 
and reference system should be carefully thought out. Every paper 


should contain an introduction summarizing the results as far as possible 
in such a way as to be understood by the non-expert. 


All other correspondence should be addressed to the Managing 
Editor, G. de B. Robinson, University of Toronto. 


The Journal is published quarterly. Subscriptions should be sent 
to the Managing Editor. The price per volume of four numbers 
is $8.00. This is reduced to $4.00 for individual members of 
recognized Mathematical Societies. 


The Canadian Mathematical Congress gratefully acknowledges the 
assistance of the following towards the cost of publishing this Journal: 
University of Alberta 
Assumption College University of British Columbia 
Carleton College École Polytechnique 
Université Laval Loyola College 
University of Manitoba McGill University 
McMaster University Université de Montréal 
Queen’s University Royal Military College 
St. Mary’s University University of Toronto 
National Research Council of Canada 


and the 
American Mathematical Society 


AUTHORIZED AS SECOND CLASS MAIL, POST OFFICE DEPARTMENT, OTTAWA 













SPECTRAL THEORY FOR A CLASS 
OF NON-NORMAL OPERATORS 


HARRY GONSHOR 


1. Introduction. As is well known, the spectral theorem plays an important
part in mathematics because of its many applications. Unfortunately, 
the theorem is valid for normal operators only. In view of this, attempts have 
been made by several mathematicians to obtain a theorem about a more general 
class of operators, which will reduce to the ordinary spectral theorem if the 
operator is normal. Brown (1) has developed a unitary equivalence theory for 
a certain class of operators. The present paper builds up a spectral theory for a 
class of operators properly containing the set of all normal operators. The 
chief technique used is that of direct integral theory. For many reasons the 
results as well as the methods of proof seem natural to the author. 

We shall assume an elementary knowledge of Banach Algebra theory, 
spectral theory, and direct integral theory. All spaces will be assumed to be 
separable. 


2. $J_n$ operators. We begin with a definition: $A$ is a $J_n$ operator if and only 
if there exists a direct integral decomposition of the space $H$ such that $A$ is 
decomposable and such that if $A(t)$ is expressed in matrix form for each point 
$t$ in the space with respect to which $H$ is decomposed, it is of the form 

\[ \begin{pmatrix} D & 0 & 0 & \cdots \\ 0 & D & 0 & \cdots \\ 0 & 0 & D & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \]

where the order of $D$ is less than or equal to $n$. We shall use the term "pure 
$J_n$ operator" to refer to the case where we insist that the order of $D$ is exactly 
$n$. In the sequel we shall let $D_m$ be an $m$-fold copy of $D$ ($m$ possibly infinite). 

Note that $A$ is a $J_1$ operator if and only if $A$ is normal. This is a consequence 
of the definition and of the spectral theorem. We also make the trivial remark 
that an $n$th order matrix is a $J_n$ operator. 

We shall assume that $a_1, a_2, a_3, \ldots \in H$ are the basic elements used. 
This means that $\{a_i(t)\}$ is an orthogonal basis for $H(t)$ for all $t$ where $H(t)$ 
is the Hilbert space at the point $t$, and $i$ runs from 1 to $\infty$ if $\dim H(t) = \infty$ 
and from 1 to $n$ if $\dim H(t) = n$. It will also be assumed that this is the basis 
used in order to obtain the matrix form for $A(t)$. 

Now let $A$ be a $J_n$ operator. For each $t$ let $r(t)$ be the least $r$ such that $A(t)$ 
is of the form $D_m$ where $D$ has order $r$. (By definition, $r \le n$.) We assert that 


Received February 12, 1954; in revised form July 5, 1955 













$r(t)$ is measurable as a function of $t$. To begin with, the set of all $t$ where 
$\dim H(t)$ is a multiple of $r_0$ is measurable. Consider the equations 

\[ [A(t)\, a_m(t),\, a_n(t)] = 0 \]

where $m < a \le n$ or $n < a \le m$ for some $a \equiv 1 \pmod{r_0}$. This is only a 
countable number of conditions, and hence the set on which all the conditions 
are simultaneously satisfied is measurable. If it is also demanded that the 
conditions are not simultaneously satisfied for any integer less than $r_0$, then 
the set which satisfies all the requirements is still measurable. But the
requirements are nothing but a restatement of the condition that $r(t) = r_0$. Hence 
$r(t)$ is measurable. This shows that a $J_n$ operator can be expressed as a direct 
sum of pure $J_m$ operators with $m \le n$. We can therefore restrict ourselves to pure $J_n$ 
operators. 
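
The pointwise test behind $r(t)$ can be illustrated on a finite matrix. The following is a minimal sketch, not part of the original argument: the function names `is_block_copy` and `least_block_order` are hypothetical, the matrix is assumed finite and given as a list of rows, and the test also checks that the diagonal blocks coincide (which the form $D_m$ requires).

```python
def is_block_copy(matrix, r):
    """True if `matrix` is block diagonal with every r-by-r diagonal block equal to the first."""
    size = len(matrix)
    if size % r != 0:
        return False
    for m in range(size):
        for n in range(size):
            if m // r != n // r:
                # Entry lies outside the diagonal blocks, so it must vanish.
                if matrix[m][n] != 0:
                    return False
            elif matrix[m][n] != matrix[m % r][n % r]:
                # Entry lies inside a diagonal block, so it must repeat the first block.
                return False
    return True


def least_block_order(matrix):
    """Least r such that `matrix` is a repeated copy of some r-by-r block D."""
    size = len(matrix)
    # r = size always succeeds (a single block), so the search terminates.
    return next(r for r in range(1, size + 1) if is_block_copy(matrix, r))


example = [
    [1, 2, 0, 0],
    [0, 3, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 0, 3],
]
print(least_block_order(example))
```

Each call to `is_block_copy` checks finitely many entry conditions, which mirrors the countable family of conditions used in the measurability argument.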

Let $H$ be decomposed into spaces $H(t)$ all of which have dimension divisible 
by $n$. (For this purpose we adopt the convention that $\infty$ is divisible by $n$.) 
We shall consider operator functions $A(t)$ which are of the form $D_m(t)$ where 
the order of $D$ is $n$ for all $t$. Let $D$ be written explicitly as 

\[ D(t) = \begin{pmatrix} a_{11}(t) & \cdots & a_{1n}(t) \\ \vdots & & \vdots \\ a_{n1}(t) & \cdots & a_{nn}(t) \end{pmatrix}. \]


LEMMA 2.1. $A(t)$ is an operator $\iff$ $a_{ij}(t) \in L^\infty$ for all $i$ and $j$. 


The proof is trivial and is therefore left to the reader. We remind the reader 
that a necessary and sufficient condition for an operator function $A(t)$ to 
represent an operator $A$ is that $[A(t)\, a_m(t), a_n(t)]$ be measurable for all $m$ and 
$n$ and that $\|A(t)\| \in L^\infty$. 


THEOREM 1. $A$ is a pure $J_n$ operator $\iff$ $H$ can be decomposed into $n$ mutually 
orthogonal equivalent projections $E_i$ (also called $U_{ii}$) with partial isometries 
$U_{ij}$ satisfying $U_{ij} U_{kl} = \delta_{jk} U_{il}$ such that $A = \sum_{i,j} A_{ij} U_{ij}$ where the $A_{ij}$ are 
mutually commuting normal operators which commute with the $U_{ij}$. 


Proof. Define $E_i(t)$ to be the projection on $[a_i(t), a_{i+n}(t), \ldots, a_{i+dn}(t), \ldots]$ 
for all $i$ satisfying $1 \le i \le n$. Clearly $E_i(t)$ is an operator. Define $U_{ij}(t)$ as the 
partial isometry from $E_j(t)$ to $E_i(t)$ which maps $a_{dn+j}(t)$ into $a_{dn+i}(t)$ for all 
integers $d$ and all other $a_k(t)$ into zero. It can immediately be checked that 
$U_{ij}(t)$ is measurable and bounded, and hence that it defines an operator on $H$. 
Since the property of being partially isometric is purely algebraic, equations 
in $H(t)$ carry over into equations on $H$. Note that 


\[ U_{ij} U_{kl} = \delta_{jk} U_{il}. \]

Also the $\{E_i\}$ are mutually orthogonal projections with sum 1. Now 

\[ A(t) = \sum_{i,j} a_{ij}(t)\, U_{ij}(t) \]

where the $a_{ij}(t)$ are scalar functions. Hence 

\[ A = \sum_{i,j} A_{ij} U_{ij} \]

where the $A_{ij}$ satisfy the conditions in the right-hand side in Theorem 1. 
Now suppose $A$ satisfies the condition in the right-hand side. It is clear 
that $R(A_{ij}, A_{ij}^*)$ is an Abelian ring commuting with the $U_{ij}$, hence in
particular with the $E_i$. We shall decompose with respect to $R$. Now $R$ can be 
regarded as an Abelian ring of operators on the space $E_1$. We can therefore 
choose a set of basic elements for a decomposition of $E_1$ which we call $f_1$, 
$f_{n+1}, f_{2n+1}, \ldots, f_{dn+1}, \ldots$ (where $d$ is an integer). We define $f_{dn+j}$, where $1 < j \le n$, 
as $U_{j1} f_{dn+1}$. It is easy to check that $f_1, f_2, \ldots$ can be used as a set of basic 
elements for a decomposition of $H$ in which $R$ is precisely the set of all
operators which become scalar operator functions. By construction of the $f$'s it is 
easily seen that $U_{ij}(t)$ is nothing but the operator which, when expressed in 
matrix form, has 1's in the $(dn + i)$th row and $(dn + j)$th column for all $d$ 
and 0's everywhere else. It follows that if $A_{ij}$ is mapped into the scalar function 
$a_{ij}(t)$, then

\[ \sum_{i,j} A_{ij} U_{ij} \]

is mapped into $D_m(t)$ where 

\[ D(t) = \begin{pmatrix} a_{11}(t) & \cdots & a_{1n}(t) \\ \vdots & & \vdots \\ a_{n1}(t) & \cdots & a_{nn}(t) \end{pmatrix}. \]

This completes the proof of the Theorem. 
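
The algebra of the $U_{ij}$ can be checked on a finite model. The sketch below is illustrative only: it assumes a single point $t$, finitely many copies of the block, and constant scalars in place of the functions $a_{ij}(t)$; the helper names `matrix_unit`, `matmul` and `assemble` are hypothetical.

```python
def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def matrix_unit(i, j, n, copies):
    """U_ij for n-by-n blocks repeated `copies` times: a 1 in row dn+i, column dn+j (1-based)."""
    size = n * copies
    u = [[0] * size for _ in range(size)]
    for d in range(copies):
        u[d * n + i - 1][d * n + j - 1] = 1
    return u


def assemble(block, copies):
    """Form the sum of a_ij * U_ij, which should reproduce the repeated block."""
    n = len(block)
    size = n * copies
    total = [[0] * size for _ in range(size)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            u = matrix_unit(i, j, n, copies)
            for row in range(size):
                for col in range(size):
                    total[row][col] += block[i - 1][j - 1] * u[row][col]
    return total


print(assemble([[1, 2], [0, 3]], 2))
```

With `n = 2` and two copies, the product `matmul(matrix_unit(i, j, 2, 2), matrix_unit(k, l, 2, 2))` equals `matrix_unit(i, l, 2, 2)` when `j == k` and the zero matrix otherwise, in line with the relation $U_{ij} U_{kl} = \delta_{jk} U_{il}$.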


Theorem 1 gives an intrinsic characterization of $J_n$ operators in the large, 
that is, no direct integral theory is required in the definition. Alternative 
characterizations of $J_n$ operators are as follows: 

(1) $R(A, A^*)$ when decomposed with respect to its center gives rise to 
factors of type $I_m$ only where $m \le n$. 

(2) $R(A, A^*)$ has at most $n$ orthogonal equivalent non-zero projections. 


3. Existence of spectral representations. We begin with certain results 
regarding measurability. 


THEOREM 2. Let $T$ be a measure space, and let $f_t(x)$ be $n$th degree polynomials 
$a_0(t) x^n + a_1(t) x^{n-1} + \ldots + a_n(t)$ which are defined for all $t \in T$ and such that 
$a_i(t)$ are measurable for all $i$. Then there exists a measurable function $y(t)$ such 
that $y(t)$ is a root of $f_t(x) = 0$ for all $t$. 


For this purpose we shall choose a complete ordering of the complex numbers 
which will be used throughout the remainder of the paper. We let 0 be the least 
and express any other complex number in the form $re^{i\theta}$ where $0 \le \theta < 2\pi$, 
ordering them lexicographically with respect to $r$ and $\theta$. Theorem 2 will be 
shown by letting $y(t)$ be the least root of $f_t(x)$ for each $t$. We use several 
lemmas. 
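
The ordering and the choice of the least root can be sketched for the quadratic case. This is an illustration under stated assumptions, not the paper's construction: the names `order_key` and `least_root_quadratic` are hypothetical, the roots are computed in floating point with the quadratic formula, and the leading coefficient is assumed non-zero.

```python
import cmath
import math


def order_key(z):
    """Sort key for the ordering: 0 first, then by modulus r, then by argument in [0, 2*pi)."""
    return (abs(z), cmath.phase(z) % (2 * math.pi))


def least_root_quadratic(a0, a1, a2):
    """Least root of a0*x**2 + a1*x + a2 (a0 != 0) in the ordering above."""
    disc = cmath.sqrt(a1 * a1 - 4 * a0 * a2)
    roots = [(-a1 + disc) / (2 * a0), (-a1 - disc) / (2 * a0)]
    return min(roots, key=order_key)


print(least_root_quadratic(1, -3, 2))
print(least_root_quadratic(1, 0, 1))
```

For $x^2 - 3x + 2$ the roots 1 and 2 are separated by modulus; for $x^2 + 1$ both roots have modulus 1 and the argument breaks the tie in favour of $i$.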


LEMMA 3.1. Let $k$ be a compact subset of the complex plane. Then the set of 
all $t$ such that $f_t(z)$ has a zero in $k$ is measurable. 


Proof. For any fixed $z_0$, 

\[ a_0(t) z_0^n + a_1(t) z_0^{n-1} + \ldots + a_n(t) \]

is a linear combination of measurable functions and is hence measurable. 
Now let $z$ run through all points of a given countable dense subset of $k$. We 
conclude that $[\text{g.l.b. } |f_t(z)|]$ is measurable, hence the set where $[\text{g.l.b. } |f_t(z)|]$ 
is 0 is measurable. Since $k$ is compact, this condition is equivalent to the
condition that $f_t(z)$ has a zero in $k$. This proves the lemma. 

In practice the compact sets used will be one of two special types: the closed 
region bounded by the circle $r = r_0$, and the "conical" region bounded by the 
circle $r = r_0$ and the lines $\theta = 0$ and $\theta = \theta_0$. 


LEMMA 3.2. The minimum modulus of the roots of $f_t(x)$ is a measurable 
function with respect to $t$. 


Proof. It is required to prove that the set where the minimum modulus is 
above $r_0$ is measurable. But the minimum modulus is above $r_0$ if and only if 
the region bounded by $r = r_0$ has no zeros. Thus the result follows from Lemma 
3.1. 


LEMMA 3.3. For every $t$ take the least argument $\theta(t)$ of all roots of $f_t(x)$ of 
minimum modulus. Then $\theta(t)$ is a measurable function of $t$. It is understood 
that the argument is taken to be non-negative and less than $2\pi$. 


Proof. We shall prove that the set where $\theta(t)$ exceeds $\theta_0$ is measurable. 
Let $S_r$ be the closed region bounded by the circle with center at the origin and 
radius $r$, and let $C_r$ be the intersection of $S_r$ and the angular region $0 \le \theta \le \theta_0$. 
Let $r$ run through all the rationals. We now consider the statement that for 
some $r$, $S_r$ has a zero of $f_t(x)$ but $C_r$ does not. Since the set where $S_r$ has a zero 
is measurable with respect to $t$ and similarly for $C_r$, the set where the above 
statement is true is measurable. (Note that only rational radii were used.) 
It remains to show that the above statement is equivalent to the fact that 
$\theta(t) > \theta_0$. 

Suppose that for some $r$, $S_r$ has a zero but $C_r$ does not. Then the minimum 
modulus of the roots does not exceed $r$. Since $C_r$ has no zero, every root of 
minimum modulus has argument greater than $\theta_0$. Thus $\theta(t) > \theta_0$. 

On the other hand, suppose $\theta(t) > \theta_0$. If $r$ is the minimum modulus of the 
roots, then $S_r$ has a zero and $C_r$ does not. To complete the proof we need a 
rational $r$ having this property. Now a polynomial has only a finite number of 
roots. Hence there is a least modulus $r_1 > r$ which the roots can have. Let $d$ 
be a rational strictly between $r$ and $r_1$. Clearly $S_d$ has a zero, because even $S_r$ 
does. $C_d$ has no zeros because $C_r$ does not; and by choice of $d$, $C_d - C_r$ has no 
zeros. This completes the proof of the lemma. 


Theorem 2 now follows trivially from Lemmas 3.2 and 3.3. 


COROLLARY. It is possible to select $n$ measurable functions $y_i(t)$ such that for 
each $t$, $\{y_i(t)\}$ are the roots of $f_t(x)$ appearing with their proper multiplicity. 
In fact, the functions may be chosen so that $y_i(t) \le y_{i+1}(t)$. 


Proof. It suffices to remark that the coefficients of $f_t(y)/\{y - y_1(t)\}$ are 
measurable because of the nature of the process of division. The corollary 
follows by inductive use of Theorem 2 and this remark. 
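
The "process of division" in this remark can be made concrete with synthetic division. A minimal sketch, assuming the polynomial is given by its coefficient list $[a_0, \ldots, a_n]$ and that the supplied value really is a root (the name `deflate` is hypothetical):

```python
def deflate(coeffs, root):
    """Divide a0*x**n + ... + an (coeffs = [a0, ..., an]) by (x - root).

    Returns the quotient coefficients. The remainder is dropped; it is zero
    whenever `root` is an exact root of the polynomial.
    """
    quotient = [coeffs[0]]
    for c in coeffs[1:-1]:
        # Each new coefficient is a sum and product of earlier quantities.
        quotient.append(c + root * quotient[-1])
    return quotient


print(deflate([1, -6, 11, -6], 1))
```

Every quotient coefficient is obtained from the $a_i$ and the root by additions and multiplications only, which is why measurability of the $a_i(t)$ and of $y_1(t)$ carries over to the quotient.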

We are now prepared to discuss matrix functions. Let $T$ be a measure space 
and $A(t)$ an $n$th order matrix defined for every $t \in T$ such that the $n^2$ scalar 
functions $a_{ij}(t)$ are measurable. As an immediate application of the previous 
theorem, there exists a measurable eigenvalue function, that is, a measurable 
function $\lambda(t)$ such that for every $t$, $\lambda(t)$ is an eigenvalue of $A(t)$. 


LEMMA 3.4. If $\lambda(t)$ is a measurable eigenvalue function then so is the dimension 
of its eigenspace. 


Proof. This is trivial if the rank formulation is used. The dimension is $r$ 
if and only if rank $[A - \lambda I]$ is $n - r$. This can be expressed in terms of a 
finite number of conditions, each of which says that a certain determinant is 
zero or that at least one of a finite number is non-zero. Since a determinant 
obtained from measurable functions is measurable, the result follows. 

It may be shown by calculation that vectors $[x_{i1}(t), \ldots, x_{in}(t)]$ can be chosen 
such that for each $t$ they form an independent basis of the eigenspace, and 
$x_{ij}(t)$ is measurable for all $i$ and $j$. 

We may now develop a canonical form for $J_2$ operators. For this purpose 
it would have sufficed to consider quadratic equations and matrices of order 
two, thus simplifying the proofs. However, the work was done in general, 
since one of our main objectives is to show that the basic ideas are valid for 
any $n$. Since unitary equivalence theory for matrices will now be used, we 
limit ourselves to $J_2$ operators. (The unitary equivalence classification of 
matrices of order $n$ increases rapidly in complication as $n$ increases. To
illustrate this remark, the reader may apply the subsequent technique to $J_3$ 
operators, and discover for himself how messy the algebra becomes.) 

Let $A$ be a $J_2$ operator. Then we know that $R(A, A^*)$ decomposed with 
respect to its center gives rise to factors of type $I_1$ or $I_2$. In this decomposition 
$A$ will be a scalar where the factor is of type $I_1$, and a block matrix $D_m$ where 
$D$ is a non-normal matrix of order two where the factor is of type $I_2$. $D$ can be 
written explicitly as 

\[ \begin{pmatrix} a_{11}(t) & a_{12}(t) \\ a_{21}(t) & a_{22}(t) \end{pmatrix}. \]













A non-normal matrix of order two may have either one or two eigenvalues. 
If $\lambda_1(t)$ and $\lambda_2(t)$ are the two roots of the characteristic equation, then the 
condition of having only one eigenvalue is that $\lambda_1(t) = \lambda_2(t)$. Since $\lambda_1(t)$ and 
$\lambda_2(t)$ are measurable by the preceding theory, the two types each occur on 
measurable subsets. 

Consider the set where the matrices have different eigenvalues. The larger 
eigenvalue $\lambda(t)$ is a measurable function of $t$. Also, eigenvectors $[x_1(t), x_2(t)]$ 
can be chosen so that both $x_1(t)$ and $x_2(t)$ are measurable. An eigenvector 
corresponding to $D_m$ would be $[x_1(t), x_2(t), 0, 0, \ldots]$ which is measurable 
regarded as a Hilbert space function on $t$. Also $[-\bar{x}_2(t), \bar{x}_1(t), 0, 0, \ldots]$ is 
measurable. This gives a set of measurable functions $f_1(t), f_2(t), \ldots$ such that 
$A(t)$ takes on the form 

\[ \begin{pmatrix} \lambda(t) & a(t) & \\ 0 & \mu(t) & \\ & & \ddots \end{pmatrix} \]

when these are used as a basis. $[A(t) f_2(t), f_1(t)]$ is measurable, and hence so is 
its argument $\theta(t)$. Therefore $e^{-i\theta(t)} f_{2n}(t)$ is measurable. Now if 

\[ f_1(t),\ e^{-i\theta(t)} f_2(t),\ \ldots \]

are used as a basis, $a(t)$ is real and positive. 
In the same way functions can be chosen on the set where $A(t)$ has only 
one eigenvalue so that $A(t)$ will take on the form 

\[ \begin{pmatrix} \lambda(t) & a(t) & \\ 0 & \mu(t) & \\ & & \ddots \end{pmatrix} \]

where $a(t)$ is real and positive if these are taken as a basis. It is well known that 
the matrices $D(t)$ we now have are canonical forms of the unitary equivalence 
classes of non-normal matrices of order two. This proves 


THEOREM 3. If $A$ is a $J_2$ operator, then $A$ can be decomposed so that for every 
$t$, $A(t)$ is either a scalar or a matrix of the form $D_m(t)$ where $D(t)$ is of the form 

\[ \begin{pmatrix} \lambda & a \\ 0 & \mu \end{pmatrix} \]

where $a$ is real and positive and $\lambda \ge \mu$. In particular the matrices $D(t)$ of order 2 
that appear are either identical or else not even unitarily equivalent. 
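
For a single matrix of order two the canonical triple $(\lambda, \mu, a)$ can be computed directly. The sketch below is an illustration, not the paper's procedure: it assumes the standard fact that unitary equivalence preserves the sum of squared moduli of the entries, so that $a^2$ equals that sum minus $|\lambda|^2 + |\mu|^2$; the names `order_key` and `canonical_triple` are hypothetical, and the arithmetic is floating point.

```python
import cmath
import math


def order_key(z):
    """Ordering of the complex numbers by modulus, then by argument in [0, 2*pi)."""
    return (abs(z), cmath.phase(z) % (2 * math.pi))


def canonical_triple(a11, a12, a21, a22):
    """Return (lam, mu, a) with lam >= mu in the ordering above and a >= 0.

    The matrix [[a11, a12], [a21, a22]] is unitarily equivalent to
    [[lam, a], [0, mu]]; a == 0 exactly when the matrix is normal.
    """
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = cmath.sqrt(trace * trace - 4 * det)
    roots = [(trace + disc) / 2, (trace - disc) / 2]
    lam = max(roots, key=order_key)  # the larger eigenvalue
    mu = trace - lam                 # the remaining eigenvalue
    frob_sq = sum(abs(z) ** 2 for z in (a11, a12, a21, a22))
    a_sq = frob_sq - abs(lam) ** 2 - abs(mu) ** 2
    return lam, mu, math.sqrt(max(a_sq, 0.0))  # clamp rounding noise


print(canonical_triple(1, 2, 0, 3))
```

For the matrix with rows $(1, 2)$ and $(0, 3)$ the larger eigenvalue is 3, so the triple places 3 first even though the input lists 1 first; the off-diagonal modulus is unchanged.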


Consider the space $Z \cup Q$ where $Z$ is the complex plane and $Q$ is the set of 
all triples $(\lambda, \mu, a)$ where $\lambda$ and $\mu$ are complex such that $\lambda \ge \mu$ and $a$ is real 
and positive. We now transfer the direct integral decomposition to the space 
$Z \cup Q$. A projection-valued measure will be defined on $Z \cup Q$. If $B \subset Z \cup Q$ 
it makes sense to speak of $f^{-1}(B)$ by making the obvious identification of 
$\lambda \in Z$ and $(\lambda, \mu, a) \in Q$ with 

\[ \begin{pmatrix} \lambda & 0 & \\ 0 & \lambda & \\ & & \ddots \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} \lambda & a & \\ 0 & \mu & \\ & & \ddots \end{pmatrix} \]

respectively and then defining $f^{-1}(B) = \{t : A(t) \in B\}$. We define $B$ to be 
measurable if $f^{-1}(B)$ is measurable, and then let $E(B) = E[f^{-1}(B)]$. This 
gives a projection-valued measure for which the Borel sets are measurable. 

$H$ decomposes with respect to the projection-valued measure on $Z \cup Q$. 
Also, the range of the projection-valued measure on $Z \cup Q$ is a subset of the 
range on the original space. We restrict ourselves for a moment to the space 
where $A(t)$ is not scalar. Define $U_{11}$, $U_{12}$, $U_{21}$, and $U_{22}$ as in the proof of 
Theorem 1, for example $U_{21}$ corresponds to the operator function which is 

\[ \begin{pmatrix} 0 & 0 & \\ 1 & 0 & \\ & & \ddots \end{pmatrix} \]

for all $t$. Observing the original decomposition, we see that the $U_{ij}$ commute 
with all the projections in the range of the measure. Therefore, a fortiori, the 
$U_{ij}$ commute with all the projections in the range of the measure on $Z \cup Q$. 
As in the proof of Theorem 1, it is possible to choose the $f$'s in such a way that 
$U_{21}$ becomes 

\[ \begin{pmatrix} 0 & 0 & \\ 1 & 0 & \\ & & \ddots \end{pmatrix} \]

on $Z \cup Q$ and similarly for $U_{12}$, $U_{11}$, and $U_{22}$. 
Since $\|A(t)\| \in L^\infty$ in the original space, it follows that $E(Q) = 0$ outside 
of a Cartesian product of bounded sets. 
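Now subdivide $Z$ and $Q$ into a finite number of Borel sets $\{Z_i\}$ and $\{Q_j\}$, 
and choose a point $z_i \in Z_i$ and $(\lambda_j, \mu_j, a_j) \in Q_j$ for each Borel set. We define 
an operator function $A_s(t)$ on the original space. $A_s(t) = z_i$ if $A(t) \in Z_i$, 
and 

\[ A_s(t) = \begin{pmatrix} \lambda_j & a_j & \\ 0 & \mu_j & \\ & & \ddots \end{pmatrix} \quad\text{if}\quad A(t) = \begin{pmatrix} \lambda(t) & a(t) & \\ 0 & \mu(t) & \\ & & \ddots \end{pmatrix} \]

where $[\lambda(t), \mu(t), a(t)] \in Q_j$. We denote the projection on the set where 
$A_s(t) = z_i$ by $E_i$ and the projection on the set where 

\[ A_s(t) = \begin{pmatrix} \lambda_j & a_j & \\ 0 & \mu_j & \\ & & \ddots \end{pmatrix} \]

by $F_j$. Then $A_s$ is simply 

\[ \sum_i z_i E_i + \sum_j \left( \lambda_j F_j U_{11} + \mu_j F_j U_{22} + a_j F_j U_{12} \right) \]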


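in view of the definition of $U_{ij}$. By definition of the projection-valued measure 
on $Z \cup Q$, the projection on $Z_i$ is $E_i$ and on $Q_j$ is $F_j$. It follows from this, that 
if $A_s$ is decomposed with respect to $Z \cup Q$, it becomes the function which is 

\[ z_i \text{ on } Z_i \quad\text{and}\quad \begin{pmatrix} \lambda_j & a_j & \\ 0 & \mu_j & \\ & & \ddots \end{pmatrix} \text{ on } Q_j. \]

We now choose a sequence of operators of the form $A_s$ which approach $A$ 
uniformly. (Note that the property of uniform convergence is preserved under 
direct integrals.) By using this sequence, we easily see the important fact that 
if $A$ is decomposed with respect to $Z \cup Q$ it becomes the function which 
is 

\[ \begin{pmatrix} \lambda & 0 & \\ 0 & \lambda & \\ & & \ddots \end{pmatrix} \text{ at } \lambda \quad\text{and}\quad \begin{pmatrix} \lambda & a & \\ 0 & \mu & \\ & & \ddots \end{pmatrix} \text{ at } (\lambda, \mu, a). \]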


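THEOREM 4 (the spectral theorem). If $A$ is a $J_2$ operator, then there exists a 
direct integral decomposition onto $Z \cup Q$ such that $A$ is decomposable and such 
that 

\[ A(t) = \begin{pmatrix} \lambda & 0 & \\ 0 & \lambda & \\ & & \ddots \end{pmatrix} \text{ at } \lambda \quad\text{and}\quad \begin{pmatrix} \lambda & a & \\ 0 & \mu & \\ & & \ddots \end{pmatrix} \text{ at } (\lambda, \mu, a). \]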


It is natural to call this "the spectral theorem" and even to use the symbolic 
notation $A = \int \lambda \, dE(\lambda)$ because, at least from the point of view of direct 
integral theory, it generalizes the usual theorem for normal operators. 


4. Uniqueness of spectral representations. In this section we show 
that the measure obtained on $Z \cup Q$ is unique when $A$ is given. To simplify 
the notation, the matrices $A(t)$ will be written as if they have only one block. 
This is legitimate since any ring of the form 

\[ \begin{pmatrix} R & 0 & \\ 0 & R & \\ & & \ddots \end{pmatrix} \]

is isomorphic and isometric to $R$. 

Since we are given that the identity function $\lambda$ corresponds to $A$, we know 
by direct integral theory that $f(\lambda, \lambda^*)$ corresponds to $f(A, A^*)$ where $f$ is a 
polynomial function. The general idea will be to approximate characteristic 
functions by such polynomials, not uniformly, but closely enough to ensure 
that the projections on the sets depend only on $A$. For normal operators, the 
Stone-Weierstrass approximation theorem can be immediately applied and the 
proof is rather trivial. As Kaplansky (3) has shown, the theorem can be 
generalized to the case where the range is in a B* algebra, but a certain amount 
of caution must be used in applying the theorem. 

For example, separation of points is not enough. In fact, consider a compact 
set of matrices of order two in the usual topology. We remark that if the set 
consists of two distinct matrices which are unitarily equivalent, then the 
B* algebra generated by the identity function is not dense in the space of all 
continuous functions even though the algebra separates points. This is because 
any scalar function in the algebra necessarily has the same value at unitarily 
equivalent matrices. Kaplansky's theorem asserts that if any two points $a$ 
and $b$ can be separated in the sense that there exists an $f$ such that $f(a) = 1$ 
and f(b) = 0, then the algebra is dense in the set of all continuous scalar 
functions. 


THEOREM 5. If $x$ and $y$ are any two distinct points in $Z \cup Q$, then there exists 
a polynomial function $f$ such that $f(x) = 1$ and $f(y) = 0$. (* is regarded as a 
polynomial operation.) 


Proof. We regard $x$ and $y$ as matrices in the natural manner and consider 

\[ \begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix}. \]

The structure of the space $Z \cup Q$ ensures that $x$ and $y$ are not unitarily
equivalent. Suppose there exists a polynomial function $f$ such that 

\[ f\left[ \begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix} \right] = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \]

Then it follows that $f(x) = 1$ and $f(y) = 0$. Thus the problem has been reduced 
to the study of the algebra of polynomials on a matrix of the type 

\[ \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \]

where $A$ and $B$ are not unitarily equivalent, and each of them together with 
its adjoint generates the full matrix ring of which it is a member. The latter 
part of the statement is true by definition of $Z \cup Q$. (For example, $Q$ does not 
contain points such as $(\lambda, \mu, 0)$.) 
It is known that there exists a polynomial function $f$ such that 

\[ f\left[ \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \right] = \begin{pmatrix} U & 0 \\ 0 & V \end{pmatrix}, \]

where $U$ is any matrix in the full matrix ring of which $A$ is a member and 
similarly for $V$. This was proved independently in the thesis (2) by making 
use of the classification theorem for finite dimensional rings of operators as 
found in (5). The result follows by taking $U = 1$ and $V = 0$. 










Although the question of separation of points has been answered, other 
difficulties arise. $Q$ is not compact in its usual topology. The addition of points 
$(\lambda, \mu, 0)$ will make $Q$ compact (if we assume that $|\lambda|, |\mu|, a \le \|A\|$) but will 
destroy the property of separation of points. For example, it is clear that the 
points $(\lambda, \mu_1, 0)$ and $(\lambda, \mu_2, 0)$ with $\mu_1 \ne \mu_2$ could not be separated, for if 

\[ f\begin{pmatrix} \lambda & 0 \\ 0 & \mu_1 \end{pmatrix} = 1 \quad\text{and}\quad f\begin{pmatrix} \lambda & 0 \\ 0 & \mu_2 \end{pmatrix} = 0, \]

then $f(\lambda) = 1$ and $f(\lambda) = 0$ which is impossible. Accordingly we regard $Q$ 
as a locally compact space and search for "enough" functions vanishing at 
infinity. 
Definition. Let $X$ be an arbitrary locally compact space. A family of compact
subsets $k_\alpha$ is said to form an inverse base for compact sets if every compact 
subset of $X$ is contained in at least one $k_\alpha$. 


Example. In the Euclidean plane, a countable inverse base for compact 
sets can be obtained by taking circles of integral radius and center at the 
origin. 

The concept of an inverse base is useful, because many properties necessarily 
hold for all compact sets if they hold for sets of a base (which are often easier 
to handle than arbitrary compact sets). For example, the condition for a 
normal family of functions of a complex variable can be expressed in terms of 
sets of an inverse base. 

We restrict ourselves to the subset of $Q$ where $|\lambda| \le \|A\|$, $|\mu| \le \|A\|$, and 
$a \le \|A\|$. In any decomposition of $A$ it is clear that the measure is concentrated 
on this subset. No confusion should arise if $Q$ is used to denote this subset. 


LEMMA 4.1. The sets $Q_\alpha$, where $\alpha$ runs through all positive real numbers and 
$Q_\alpha$ is the set of all $(\lambda, \mu, a) \in Q$ such that $a \ge \alpha$, form an inverse base for compact 
sets of $Q$. 


Proof. Clearly $Q_\alpha$ is compact for all $\alpha$. Let $k$ be a compact subset of $Q$. 
The mapping $(\lambda, \mu, a) \to a$ is continuous, and therefore has a minimum $\alpha$ 
when restricted to $k$. Obviously $k \subset Q_\alpha$. 


COROLLARY. A function on $Q$ vanishes at infinity if and only if it approaches 
zero, as $a$ approaches zero, uniformly with respect to $\lambda$ and $\mu$. 


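LEMMA 4.2. All polynomials of the form $fg - gf$ vanish at infinity. 


Proof. 

\[ (fg - gf)\begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix} = \begin{pmatrix} (fg - gf)(\lambda) & 0 \\ 0 & (fg - gf)(\mu) \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}. \]

Also $fg - gf$ is continuous on $Q \cup (\lambda, \mu, 0)$, and is hence uniformly continuous. 
The rest follows by the corollary to Lemma 4.1. 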
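
As a worked instance of Lemma 4.2 (an illustration added here, not taken from the original proof), take $f$ to be the identity function and $g$ the adjoint operation, and evaluate at the point $(\lambda, \mu, a)$, writing $A$ for the corresponding matrix of order two with $a$ real:

```latex
\[
A = \begin{pmatrix} \lambda & a \\ 0 & \mu \end{pmatrix}, \qquad
AA^* = \begin{pmatrix} |\lambda|^2 + a^2 & a\bar{\mu} \\ a\mu & |\mu|^2 \end{pmatrix}, \qquad
A^*A = \begin{pmatrix} |\lambda|^2 & a\bar{\lambda} \\ a\lambda & a^2 + |\mu|^2 \end{pmatrix},
\]
\[
AA^* - A^*A = \begin{pmatrix} a^2 & a(\bar{\mu} - \bar{\lambda}) \\ a(\mu - \lambda) & -a^2 \end{pmatrix}.
\]
```

Every entry carries a factor $a$, and $\lambda$ and $\mu$ are bounded on the subset of $Q$ under consideration, so the commutator tends to zero as $a$ approaches zero, uniformly with respect to $\lambda$ and $\mu$.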

Now let $x$ and $y$ be two distinct points in $Q$. By the strong form of Theorem 
5 (see the end of the proof), there exists a polynomial $f$ such that 

\[ f(x) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad f(y) = 0. \]


Consider $(ff^* - f^*f)^2$. By Lemma 4.2, this vanishes at infinity since algebraic 
combinations of functions vanishing at infinity still vanish at infinity. The 
function is clearly zero at $y$ and computation shows that it is one at $x$. This 
proves 


LEMMA 4.3. If $x$ and $y$ are any two distinct points in $Q$, there exists a
polynomial function $f$ vanishing at infinity such that $f(x) = 1$ and $f(y) = 0$. 
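
The computation behind Lemma 4.3 can be written out as follows. This is a sketch under the assumption that the value of $f$ at $x$ is the matrix unit with a single 1 in the upper right corner; the symbol $N$ is introduced here only for this illustration.

```latex
\[
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
NN^* = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
N^*N = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix},
\]
\[
(NN^* - N^*N)^2 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]
```

At $y$ the value of $f$ is 0, so the same expression is 0 there; the choice of the other off-diagonal matrix unit for $N$ would give the same result.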


Notice that not all polynomial functions on Q vanish at infinity, but we have 
shown that those that do vanish, separate points. By Kaplansky’s theorem 
(3) the set of polynomial functions that vanish at infinity is dense in the set 
of all scalar functions vanishing at infinity. 

Next we approximate characteristic functions by continuous functions. 
It is clear that the Borel sets in $Q$ are generated by the compact sets. Let $k$ 
be a compact set. Since $Q$ is a metric space, $k'$ is a countable union of closed 
sets $F_1 \subset F_2 \subset F_3 \subset \ldots$. By applying Urysohn's lemma to the $F_i$ in
succession we obtain continuous functions which are uniformly bounded (in fact 
by 1) and which tend pointwise to the characteristic function on $k$. By means 
of these results we can now solve the uniqueness problem stated at the
beginning of the section. 

We know that $f(\lambda, \lambda^*)$ corresponds to $f(A, A^*)$ for all polynomial functions 
f. Since all continuous scalar functions vanishing at infinity can be uniformly 
approximated by polynomials, they correspond to well defined limits of 
sequences of operators each of which is of the form f(A, A*). A pointwise 
limit of a uniformly bounded sequence of functions corresponds to the strong 
limit of the sequence of operators corresponding to the functions by direct 
integral theory. This means that the projections on compact sets are unique. 
This implies uniqueness of the projection-valued measure restricted to Q. 
In particular, the function which is 1 on Q corresponds to a well-defined 
projection. 

So far, all functions considered were 0 on $Z$. Now by subtraction, the
function which is 1 on $Z$ and 0 on $Q$ corresponds to a unique projection. (Of course, 
the function which is 1 everywhere corresponds to 1 and therefore to something 
unique.) Multiplying by the identity function $\lambda$ which corresponds to $A$, 
we see that the function which is the identity on $Z$ ($\lambda$ at $\lambda$) and 0 on $Q$
corresponds to a unique operator. It is now easy to verify that the
projection-valued measure is unique on $Z$. (The proof is essentially the same as the proof 
for uniqueness on $Q$ except that the details are much simpler for $Z$.) 

Before stating the fundamental theorem it is convenient to make a definition. 


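Definition. If $E$ is a projection-valued measure on $Z \cup Q$, 

\[ A = \int \lambda \, dE(\lambda) \]

means that if $A$ is decomposed with respect to the measure, it becomes an 
operator function of the form 

\[ \begin{pmatrix} \lambda & a & \\ 0 & \mu & \\ & & \ddots \end{pmatrix} \text{ at } (\lambda, \mu, a) \quad\text{and}\quad \begin{pmatrix} \lambda & 0 & \\ 0 & \lambda & \\ & & \ddots \end{pmatrix} \text{ at } \lambda. \]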


THEOREM 6. There is a one-one correspondence between $J_2$ operators and 
projection-valued measures on $Z \cup Q$ concentrated on a compact set such that if 
$A$ corresponds to $E$, 

\[ A = \int \lambda \, dE(\lambda). \]


As a corollary we have the theorem of unitary equivalence. 


THEOREM 7. There is a one-one correspondence between unitary equivalence 
classes of $J_2$ operators and collections of null sets which include the complement 
of some compact set together with a measurable multiplicity function where the 
range consists of the positive integers and the symbol $\aleph_0$, and the range is even 
valued on $Q$; where the correspondence is obtained by Theorem 6 together with 
multiplicity theory. 


Remark 1. If we have a decomposition of $A$ in the manner described in 
Theorem 6, then the decomposition is necessarily with respect to the center 
of $R(A, A^*)$, i.e. the center of $R(A, A^*)$ is precisely the set of all operators 
which decompose into scalar functions. 


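Remark 2. Not only does $A$ determine the $E$'s uniquely but even the $U_{ij}$'s 
are uniquely determined. (The $U_{ij}$'s are used in the same sense as before with, 
for example, $U_{11}$ corresponding to the function which is 

\[ 0 \text{ on } Z \quad\text{and}\quad \begin{pmatrix} 1 & 0 & \\ 0 & 0 & \\ & & \ddots \end{pmatrix} \text{ on } Q.) \]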


To verify the second remark it is necessary to make use of another theorem 
by Kaplansky (3) which also considers functions with range in a B* algebra. 
The theorem asserts that if any two points x and y can be separated in the 
sense that for any two elements G and H in B*, there exists an f such that 
f(x) = G and f(y) = H, then the algebra is dense in the set of all continuous 
functions with range in the B* algebra. 

Thus a $J_2$ operator has a rich supply of projections associated with it, 
"vertical" (such as $U_{ij}$) as well as "horizontal." 


5. Applications. Brown (1) obtained a unitary equivalence theory for a 
class of “‘binormal” operators. He approached the problem in a purely algebraic 
manner completely avoiding direct integral theory. We now state without 
proof various relations between his approach and ours. More detail can be 
found in (2). 

$A$ is binormal $\Rightarrow$ $A$ is a $J_2$ operator. 

The normal kernel of $A$ (according to Brown) is the projection on $Z$. 

$A^2 = 0 \iff$ measure on $Z \cup Q$ associated with $A$ is concentrated on 0, and 
$(0, 0, a)$. 

$A^2 = A \iff$ measure on $Z \cup Q$ associated with $A$ is concentrated on (0), 
(1), and $(1, 0, a)$. 

The last two results enable us to find a unitary equivalence theory for 
nilpotent (of order two) and idempotent operators. 
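
The two statements above can be checked at the level of a single block. The following sketch is only a sanity check on matrices of order two, with hypothetical helper names `point_matrix` and `matmul2`; it does not address the measure-theoretic content.

```python
def point_matrix(lam, mu, a):
    """Matrix of order two attached to the point (lam, mu, a) of Q."""
    return [[lam, a], [0, mu]]


def matmul2(x, y):
    """Product of two 2-by-2 matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


for a in (0.5, 1, 2):
    nilpotent = point_matrix(0, 0, a)
    idempotent = point_matrix(1, 0, a)
    # Points (0, 0, a) square to zero; points (1, 0, a) square to themselves.
    print(matmul2(nilpotent, nilpotent), matmul2(idempotent, idempotent))
```

The scalars 0 and 1 on $Z$ satisfy $z^2 = 0$ and $z^2 = z$ respectively, which accounts for the remaining points named in the two statements.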

It is useful to have other forms of the spectral theorem which do not involve 
direct integral theory or refer to the specific space $Z \cup Q$. 

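If $A$ is any $J_2$ operator, then there exists a sequence $\{A_n\}$ of operators 
approaching $A$ uniformly such that for all $n$, $A_n$ has the property that the 
Hilbert space can be split into a finite number of orthogonal spaces on each of 
which it is of the form 

\[ \begin{pmatrix} \lambda & 0 & \\ 0 & \lambda & \\ & & \ddots \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} \lambda & a & 0 & 0 & \\ 0 & \mu & 0 & 0 & \\ 0 & 0 & \lambda & a & \\ 0 & 0 & 0 & \mu & \\ & & & & \ddots \end{pmatrix}. \]

Note that for $J_1$ operators this corresponds to the well-known fact that $A$ can 
be uniformly approximated by operators of the form 

\[ \sum_i \lambda_i E_i. \]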


6. Generalizations. The following generalizations are possible. 

(1) We may define unbounded $J_n$ operators. The results will still be valid
except for the fact that the measure is not concentrated on a compact set.

(2) The main theorems can be extended to the case where $n > 2$. The
chief difficulty is that instead of the space $Z \cup Q$, we shall have
$Z \cup Q \cup Q_3 \cup Q_4 \cup \dots \cup Q_n$, where the spaces increase rapidly in complication. (Even $Q_3$ is
complicated!) However, the basic ideas still go through.


REFERENCES 


1. A. Brown, On binormal operators, Amer. J. Math., 76 (1954), 414-434. 

2. H. Gonshor, Spectral theorem for a class of non-normal operators, Ph.D. thesis, Harvard 
University, June 1953. 

3. I. Kaplansky, The structure of certain operator algebras, Amer. Math. Soc. Trans., 70 (1951),
219-233.

4. B. E. Mitchell, Unitary transformations, Can. J. Math., 6 (1954), 69-72. 

5. J. von Neumann, On rings of operators, reduction theory, Ann. Math., 50 (1949), 401-485. 

6. J. von Neumann and F. J. Murray, On rings of operators IV, Ann. Math., 44 (1943), 772-773.


Pennsylvania State University 











GROUPS OF POSITIVE OPERATORS 
H. A. DYE AND R. S. PHILLIPS


1. Introduction. Semi-groups of bounded positive operators on certain 
function spaces enter the theory of stochastic processes of the diffusion type 
in an essential way. It is a matter of experience that these semi-groups cannot 
be imbedded in groups of positive operators, or, in more special terms, that 
the solution of a diffusion equation does not define a one-parameter group of 
positive operators on the natural function space. The present work originated 
with an effort to explain this circumstance by showing, under appropriate 
conditions, that a group of positive operators will solve only a first order 
partial differential equation (see §3). Aspects of this problem, however, pointed 
the way to a general study of group representations by bounded positive 
operators on Co(X), the space of real-valued continuous functions vanishing 
at infinity on a locally compact Hausdorff space X. A typical problem arising 
here, for example, was that of determining when a bounded positive group
representation on $C_0(X)$ is equivalent to a pure flow (or isometric) representation.
In the main, then, this paper deals with general questions of this type.

The existence of a certain canonical factorization of elements in a group of 
positive operators provides the technical basis for our study. Expressly, any
representation $\sigma \to U_\sigma$ of a group $G$ by bounded positive operators on $C_0(X)$
splits into a product $U_\sigma = L_{\theta(\cdot,\sigma)} T_\sigma$ of a flow representation $\sigma \to T_\sigma$ of $G$ by
isometries of $C_0(X)$ and pointwise multiplications by functions in $P(X)$, the class
of all positive continuous functions on $X$ bounded away from 0 and infinity.
In particular, the group of all positive operators on $C_0(X)$ belonging to a given
flow splits into a semi-direct product of that flow by $P(X)$. While this theorem
is not essentially new, its implications have not been studied extensively.

It develops that equivalence properties of the positive representations of
$G$ on $C_0(X)$ hinge on the analysis of certain functional identities. The
multiplication factors $\theta(\cdot, \sigma)$ arising in the factorization of $U_\sigma$ satisfy the characteristic
identity $\theta(\cdot, \sigma\tau) = \theta(\cdot\sigma, \tau)\,\theta(\cdot, \sigma)$, and the representation $[U_\sigma]$ will be
equivalent to a pure flow (in a natural sense) if and only if $\theta(\cdot, \sigma)$ has the form
$\theta(\cdot, \sigma) = g(\cdot)/g(\cdot\sigma)$, for some $g$ in $P(X)$. To provide a natural algebraic
vehicle for this analysis, certain elementary notions from the cohomology
theory of Eilenberg and MacLane (2) are discussed in §4. Algebraic techniques
suggested by this theory, and involving in particular the cohomology group
$H^1(G, P(X))$, are employed variously throughout the rest of the paper.

Received July 25, 1955. Portions of this research were supported by the Office of Ordnance
Research, U.S. Army, under Contracts DA-36-034-ORD-1292 and DA-04-495-ORD-613.
In addition, the investigation was started while the second named author held a John Simon
Guggenheim Fellowship.




Our main results in this direction are contained in §§5, 6, and 7. In §5, we
prove that a bounded positive representation belonging to an ergodic flow is
already equivalent to that flow, and show by example that bounded
representations are not in general equivalent to flows. In §6, we study the
automorphism group of the group of all positive operators on $C_0(X)$ belonging to
a given flow. Theorems here concern the semi-direct product structure of the
group of bounded automorphisms, and the characterization of the group of
flow-related automorphisms modulo inner automorphisms. In §7, we show
that the adjoint representation of a given strongly continuous bounded
positive representation of a topological group $G$ on $C_0(X)$ will be equivalent to
the adjoint of the flow representation provided only a Borel measurable
factorization of the multiplication factor $\theta(\cdot, \sigma)$ exists. Equivalence of the
adjoint representations of two positive representations $[U_\sigma]$ and $[V_\sigma]$ of $G$
on $C_0(X)$ implies that the spectra of $U_\sigma$ and $V_\sigma$ coincide.


The appendix contains an application of the foregoing theory to semi-groups
of operators. Two one-parameter groups of operators are exhibited with the
property that the sum of their infinitesimal operators has no extension
generating even a semi-group of operators.


2. Factorization of positive operators. Let $L(X)$ [resp., $C_0(X)$] denote
the algebra of all real-valued continuous functions with compact support
[resp., vanishing at infinity] on the locally compact Hausdorff space $X$. We
take these spaces with the customary norm on continuous functions, namely

$$\|f\| = \sup |f(x)|,$$

so that $L(X)$ is dense in $C_0(X)$. A positive operator on $L(X)$ [resp. $C_0(X)$] is
by definition an everywhere defined linear transformation of $L(X)$ [$C_0(X)$]
into itself which carries non-negative functions into non-negative functions.
Our discussions will center on the class of bounded positive operators which
have bounded positive inverses (a decisive restriction), and in this section
we derive the basic factorization theorem cited in the introduction.


LEMMA 2.1. Let $U$ be a bounded regular operator on $L(X)$ [resp. $C_0(X)$],
which together with its inverse is positive. Then there exists a positive continuous
function $p(\cdot)$ on $X$, bounded away from 0 and $\infty$, and a homeomorphism $x \to x\sigma$
of $X$ such that

(2.1) $U = L_p T_\sigma$,

where $L_p$ denotes pointwise left multiplication by $p(\cdot)$ and $T_\sigma$ is the automorphism
$(T_\sigma f)(x) = f(x\sigma)$ of $L(X)$ [$C_0(X)$] implemented by $\sigma$. Components in this
factorization are uniquely determined, and one has $\|U\| = \|p\|$.
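
For a finite discrete space $X = \{0, \dots, n-1\}$ the spaces $L(X)$ and $C_0(X)$ both reduce to $\mathbb{R}^n$, and a positive operator is a matrix with non-negative entries, so the factorization (2.1) can be checked by hand. The following Python sketch is only an illustration of the statement in that special case, not part of the original argument. The names `factor_positive_operator`, `apply_operator`, and `sup_operator_norm` are hypothetical, and the code assumes the input matrix has exactly one positive entry in each row, which is what the Lemma predicts when the inverse is also positive.

```python
def factor_positive_operator(U, tol=1e-12):
    """Return (p, sigma) with (U f)(x) = p[x] * f[sigma[x]].

    Assumes U is a square list of lists whose rows each contain exactly one
    positive entry (the finite analogue of U = L_p T_sigma).
    """
    n = len(U)
    p, sigma = [], []
    for x in range(n):
        support = [y for y in range(n) if abs(U[x][y]) > tol]
        if len(support) != 1 or U[x][support[0]] <= 0:
            raise ValueError(f"row {x} is not a positive multiple of a point evaluation")
        sigma.append(support[0])      # the point x*sigma
        p.append(U[x][support[0]])    # the multiplier p(x)
    if sorted(sigma) != list(range(n)):
        raise ValueError("x -> x*sigma is not a bijection, so U has no positive inverse")
    return p, sigma


def apply_operator(U, f):
    """Matrix-vector product (U f)(x) = sum over y of U[x][y] * f[y]."""
    return [sum(row[y] * f[y] for y in range(len(f))) for row in U]


def sup_operator_norm(U):
    """Operator norm induced by the sup norm: the largest absolute row sum."""
    return max(sum(abs(entry) for entry in row) for row in U)


# Sample operator on a three-point space (values chosen only for illustration).
U = [[0.0, 2.0, 0.0],
     [0.0, 0.0, 0.5],
     [3.0, 0.0, 0.0]]
p, sigma = factor_positive_operator(U)
f = [1.0, 10.0, 100.0]
```

Here `p` plays the role of $p(\cdot)$ and `sigma` the role of $x \to x\sigma$; the largest entry of `p` agrees with `sup_operator_norm(U)`, in line with $\|U\| = \|p\|$.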


Proof. Suppose that $U$ is a bounded positive operator on $C_0(X)$ having a
bounded positive inverse. We prove first the useful fact that $U$ maps $L(X)$
onto itself. For this, it suffices to prove that $Uf$ lies in $L(X)$ for each $f$ in the
positive cone of $L(X)$, namely $L^+(X)$. Consider then an $f \in L^+(X)$ and choose
an $h \in L^+(X)$ which assumes the value 1 on all of the support of $f$. We
approximate to $Uh$ by a function $g \in L^+(X)$ chosen so that

$$\|Uh - g\| < (2\|U^{-1}\|)^{-1}, \qquad 0 \le g \le Uh.$$

We have $0 \le U^{-1}g \le h$ and $\|U^{-1}g - h\| < \tfrac{1}{2}$. It follows that $U^{-1}g > \tfrac{1}{2}$ on the
support of $f$, so that $f \le 2\|f\|\,U^{-1}g$ and hence $0 \le Uf \le 2\|f\|\,g$. Therefore
the support of $Uf$ lies in the (compact) support of $g$,
proving our assertion. In view of this fact, it will clearly suffice to prove the
theorem for $L(X)$.

We next show that there exists a one-to-one map of $X$ on itself, $x \to x\sigma$,
such that $g(x\sigma) = 0$ if and only if $Ug(x) = 0$ for each $g \in L(X)$. Again it is
clear that we can restrict our considerations to $L^+(X)$. We further note that
$U$ and its inverse being positive implies that $U$ describes a linear order
isomorphism in $L(X)$ so that $U(f \vee g) = Uf \vee Ug$. Now for fixed $x \in X$,
we set

$$I = [h;\ h \in L^+(X),\ Uh(x) = 0].$$

It follows from the above remarks that $I$ is a closed positive cone, closed with
respect to the lattice operation $\vee$, and neither empty nor all of $L^+(X)$. Let
$Z(h) = [y;\ h(y) = 0]$. If $h_1, h_2 \in I$, then $h_1 \vee h_2 \in I$ and

$$Z(h_1 \vee h_2) = Z(h_1) \cap Z(h_2).$$

Consequently if $F = \bigcap\,[Z(h);\ h \in I]$ is disjoint from a given compact set
$C$, then there is an $h \in I$ with $Z(h) \cap C = \emptyset$ (the null set). Now if $g \in L^+(X)$
has $C$ as its support, then $0 \le g \le \alpha h$ for $\alpha$ sufficiently large and therefore
$0 \le Ug(x) \le \alpha\,Uh(x) = 0$ so that $g \in I$. Since $I$ is a proper subset of $L^+(X)$,
it follows from this that $F$ is necessarily non-empty. On the other hand, $F$
can contain no more than one point. For if $y_1, y_2 \in F$, it is easy to construct
functions $k_1, k_2 \in L^+(X)$ such that $k_i(y_i) > 0$, $i = 1, 2$, and $k_1 \wedge k_2 = 0$.
Thus

$$0 = U(k_1 \wedge k_2)(x) = [Uk_1(x)] \wedge [Uk_2(x)].$$

Consequently either $k_1$ or $k_2$ lies in $I$, so that $F$ cannot contain both $y_1$ and $y_2$.
Denote the single point in $F$ by $x\sigma$. We see that $Ug(x) = 0$ implies $g(x\sigma) = 0$,
and $x\sigma$ is the only point for which this assertion holds for all $g \in L^+(X)$.
On the other hand if $f \in L^+(X)$ and $f$ vanishes identically in a neighborhood
of $x\sigma$, then a compact support for $f$ is disjoint from $F = \{x\sigma\}$ and hence as
above $f \in I$. Now any $g \in L^+(X)$ with $g(x\sigma) = 0$ can be approximated in
norm by functions of the type $f$ and hence any such $g$ belongs to $I$; that is
$g(x\sigma) = 0$ implies $Ug(x) = 0$. Finally to show that $\sigma$ maps $X$ onto itself we
have only to derive the corresponding assertions for $U^{-1}$ and note that these
involve $\sigma^{-1}$ in place of $\sigma$.

For each $x \in X$ choose a $g_x \in L(X)$ so that $g_x(x\sigma) = 1$; set $p(x) = Ug_x(x)$.
Then for any $f \in L(X)$, $f - f(x\sigma)\,g_x$ vanishes at $x\sigma$ and therefore

$$Uf(x) = f(x\sigma) \cdot Ug_x(x) = p(x) f(x\sigma).$$

This is the desired representation of $U$. It follows from this representation
that $p(x)$ is non-negative and bounded on $X$ and, since

$$U^{-1}f(x) = [p(x\sigma^{-1})]^{-1} f(x\sigma^{-1}),$$

we see that $p(x)$ is also bounded away from 0. To prove that $x \to x\sigma$ is a
homeomorphism let a neighborhood $N(x_0\sigma)$ be given and choose $f \in L^+(X)$
so that $N(x_0\sigma)$ is a support for $f$ and $f(x_0\sigma) > 0$. Since $Uf \in L(X)$, the set

$$N(x_0) = [x;\ p(x) f(x\sigma) > 0]$$

defines a neighborhood of $x_0$ with the property that $[N(x_0)]\sigma \subset N(x_0\sigma)$.
The mapping $\sigma$ is therefore continuous and a similar argument applied to
$U^{-1}$ establishes the continuity of $\sigma^{-1}$. It is now easy to prove that $p$ is continuous
on $X$. In fact if $N(x_0\sigma)$ is chosen to have a compact closure, then there exists
an $f \in L^+(X)$ with $f(x) = 1$ for all $x \in N(x_0\sigma)$. In this case $p(x)$, which
is identical with $Uf(x)$ in $N(x_0) = N(x_0\sigma)\sigma^{-1}$, is seen to be continuous
at $x_0$. Finally we note that the uniqueness of the factorization (2.1) follows
trivially from the fact that $L_p T_\sigma = I$ (identity operator) entails $p = 1$,
$\sigma = e$ (the identity homeomorphism), and so all parts of the Lemma are
proved.

When $X$ is compact the Lemma implies the following result, due to Kadison (6):

COROLLARY. If $X$ is compact, then any linear order isomorphism of $L(X)$
which conserves the identity is implemented by a homeomorphism of $X$.


With this, we pass to a characterization of groups of positive operators.
Some notation is needed. Given a group $G$ and a topological space $X$, we
say that $G$ acts on $X$ if a representation of $G$ in the group of homeomorphisms
of $X$ is given. A flow of $G$ in $C_0(X)$ ($X$ locally compact) is a representation
$\sigma \to T_\sigma$ of $G$ by a group of isometries of $C_0(X)$. Given a flow of $G$ in $C_0(X)$, one
can find an action $x \to x\sigma$ of $G$ on $X$ such that $(T_\sigma f)(x) = f(x\sigma)$ for all $f$ in
$C_0(X)$ (cf. (1) and (10)).


THEOREM 2.1. Let $\sigma \to U_\sigma$ be a representation of a group $G$ by bounded
positive operators on $C_0(X)$. These operators $U_\sigma$ have a factorization

(2.2) $U_\sigma = L_{\theta(\cdot,\sigma)} T_\sigma$,

where, for each $\sigma$, $\theta(\cdot, \sigma)$ is a positive continuous function on $X$, bounded away
from 0 and infinity, and $\sigma \to T_\sigma$ is a flow representation of $G$ on $C_0(X)$
implemented by an action $x \to x\sigma$ of $G$ on $X$. These functions $\theta(x, \sigma)$ satisfy

(2.3) $\theta(x, \sigma\tau) = \theta(x\sigma, \tau)\,\theta(x, \sigma)$ and $\theta(x, e) = 1$.

If $G$ is a topological group, and if the representation $\sigma \to U_\sigma$ is strongly continuous,
then

(2.4) the mapping $(x, \sigma) \to x\sigma$ is continuous on $X \times G$ to $X$, and

(2.5) the function $\theta(x, \sigma)$ is continuous on $X \times G$.

Conversely, given an action $x \to x\sigma$ of $G$ on $X$ and a function $\theta(x, \sigma)$ subject to
all the above conditions, then $\sigma \to U_\sigma = L_{\theta(\cdot,\sigma)} T_\sigma$ defines a strongly continuous
representation of $G$ on $L(X)$.


Proof. The representation identity $(U_\sigma)^{-1} = U_{\sigma^{-1}}$ assures that each $U_\sigma$
has a bounded positive inverse, and therefore, the existence of the factorization
(2.2) follows from the Lemma. Now,

$$L_{\theta(\cdot,\sigma\tau)} T_{\sigma\tau} = U_{\sigma\tau} = U_\sigma U_\tau = L_{\theta(\cdot,\sigma)} T_\sigma L_{\theta(\cdot,\tau)} T_\tau = L_{\theta(\cdot,\sigma)\,\theta(\cdot\sigma,\tau)} T_\sigma T_\tau.$$

By the uniqueness of factorization, therefore, $T_{\sigma\tau} = T_\sigma T_\tau$ and

$$\theta(\cdot, \sigma\tau) = \theta(\cdot\sigma, \tau)\,\theta(\cdot, \sigma),$$

proving the first part of (2.3). That $\theta(\cdot, e) = 1$ follows trivially from this
identity.

We turn to the topological properties. First, given a compact $C$ in $X$ and
an open $V \supset C$, we argue, there exists a neighborhood $N$ of the identity $e$
in $G$ so that $CN \subset V$. In fact, choose an $h$ in $L(X)$ which is 1 on $C$ and 0
outside $V$, and then apply strong continuity to choose $N$ so that

$$\|\theta(\cdot, \sigma)\,h(\cdot\sigma) - h(\cdot)\| < 1,$$

for all $\sigma$ in $N$. Trivially, this entails $CN \subset V$.

With this, we can see that (2.4) holds: given $x_0$, $\sigma_0$, and a neighborhood
$V$ of $x_0\sigma_0$, choose a neighborhood $U$ of $x_0$ with compact closure which satisfies
$\overline{U}\sigma_0 \subset V$. Now, by the preceding paragraph, choose a neighborhood $N$
of $e$ so that $\overline{U}\sigma_0 N \subset V$. This proves (2.4).

For (2.5), note first that it suffices to prove joint continuity at each pair $x$,
$e$ ($e$ the identity); in fact, given this, suppose $\lim x_\alpha = x_0$ and $\lim \sigma_\beta = e$. Then

$$\lim \theta(x_\alpha, \sigma_0\sigma_\beta) = \lim \theta(x_\alpha\sigma_0, \sigma_\beta) \cdot \lim \theta(x_\alpha, \sigma_0) = \theta(x_0, \sigma_0),$$

so joint continuity will be established in general. To prove joint continuity
at $x$, $e$, consider any compact set $C$ in $X$. Choose $D$ compact with $C$ in its
interior, choose a symmetric neighborhood $N$ of $e$ in $G$ so that $CN \subset D$,
and finally, let $f$ be a function in $L(X)$ which is 1 on $D$. Since $f(x\sigma) = 1$ on
$C \times N$, we then have, for $x$ in $C$ and $\sigma$ in $N$,

$$|\theta(x, \sigma) - 1| = |[\theta(x, \sigma) - 1] f(x\sigma)| \le |\theta(x, \sigma) f(x\sigma) - f(x)| + |f(x\sigma) - f(x)|.$$

Shrinking $N$ if necessary, we can arrange that the first term on the right be
arbitrarily small for all $x \in C$ by appealing to the strong continuity of $U_\sigma$;
the second term vanishes on $C$ by our choice of $f$. Thus (2.5) is established.

The converse follows readily from the fact that, for any $f$ in $L(X)$, we can
choose a neighborhood $N$ of $e$ so that $\|f(\cdot\sigma) - f(\cdot)\| < \varepsilon$, for all $\sigma$ in $N$,
together with the fact that the numbers $\theta(x, \sigma)$ will be uniformly close to 1,
for $x$ in the support of $f$ and $\sigma$ in some neighborhood of $e$. It may be noted
here that the converse will hold with $C_0(X)$ replacing $L(X)$ if it is known
that the functions $\theta(\cdot, \sigma)$ are uniformly bounded for $\sigma$ in some neighborhood
of the identity; this will be the case, for example, if $G$ is locally compact.
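
The link between the identity (2.3) and the representation property $U_\sigma U_\tau = U_{\sigma\tau}$ can be checked exactly on a small example. The sketch below is an illustration under stated assumptions rather than anything taken from the paper: it takes $X = G = \mathbb{Z}/5\mathbb{Z}$ acting on itself by translation, builds $\theta(x, \sigma) = p(x)/p(x\sigma)$ from arbitrary sample values `p`, and uses hypothetical helper names (`act`, `theta`, `U`). Exact rational arithmetic avoids rounding questions.

```python
from fractions import Fraction

N = 5  # X = G = Z/5Z, acting by translation: x*sigma = (x + sigma) mod N


def act(x, sigma):
    """Right action of G on X."""
    return (x + sigma) % N


# An arbitrary positive function p on X (hypothetical sample values).
p = [Fraction(1), Fraction(3), Fraction(2), Fraction(7, 2), Fraction(5)]


def theta(x, sigma):
    """Multiplication factor theta(x, sigma) = p(x) / p(x*sigma)."""
    return p[x] / p[act(x, sigma)]


def U(sigma, f):
    """(U_sigma f)(x) = theta(x, sigma) * f(x*sigma), as in (2.2)."""
    return [theta(x, sigma) * f[act(x, sigma)] for x in range(N)]


def cocycle_identity_holds():
    """Check both parts of (2.3) for every x, sigma, tau."""
    product_rule = all(
        theta(x, (s + t) % N) == theta(act(x, s), t) * theta(x, s)
        for x in range(N) for s in range(N) for t in range(N)
    )
    unit_rule = all(theta(x, 0) == 1 for x in range(N))
    return product_rule and unit_rule


def is_representation(f):
    """Check U_sigma U_tau f = U_{sigma tau} f for every pair (sigma, tau)."""
    return all(
        U(s, U(t, f)) == U((s + t) % N, f)
        for s in range(N) for t in range(N)
    )
```

Both checks succeed for any choice of positive `p`; the group here is abelian only to keep the example short, and the same computation goes through for a non-abelian group acting on the right.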




3. The infinitesimal operator of a group representation. When $G$
is the additive group of real numbers with the usual topology, then one can
define an infinitesimal operator for a strongly continuous representation
$[U_t;\ -\infty < t < \infty]$ of $G$ by bounded positive operators on $C_0(X)$ as

(3.1) $\lim_{\eta \to 0} \eta^{-1}[U_\eta f - f] = Af$,

where the domain of $A$, in symbols $D(A)$, consists of all $f \in C_0(X)$ for which
this limit exists. It can be shown (see (5, chap. IX)) that $A$ is a closed linear
operator with dense domain and that for $f \in D(A)$

(3.2) $\frac{d}{dt} U_t f = A U_t f, \qquad -\infty < t < \infty.$

Thus if $A$ happens to be a differential operator, then $u(t, x) = U_t f(x)$ satisfies
the differential equation

(3.3) $\frac{\partial}{\partial t} u(t, x) = [A u(t, \cdot)](x).$

We shall now determine the precise form of the infinitesimal operator for 
the above group representation under the assumption that A is a differential 
operator. This requires a certain amount of specialization. In the first place 
the concept of a differential operator does not make sense unless X is a 
differentiable manifold. In general this is not in itself sufficient and we therefore 
make the following additional assumption, which has the effect of imposing 
a degree of local regularity on A. 


ASSUMPTION D. All functions of class $C^{(1)}$ with compact supports belong to
$D(A)$.


THEOREM 3.1. Let $[U_t;\ -\infty < t < \infty]$ be a strongly continuous group of
linear bounded positive operators on $C_0(X)$ to itself where $X$ is an $n$-dimensional
manifold of class $C^{(1)}$. If the domain of the infinitesimal operator satisfies
Assumption D, then there exists a continuous scalar $\beta(x)$ and a continuous
contravariant vector field $\alpha(x)$ such that

(3.4) $[Af](x) = \alpha(x) \cdot \nabla f(x) + \beta(x) f(x), \qquad f \in D(A) \cap C^{(1)},$

where $\nabla$ is the gradient operator.


Proof. Since this is a local problem, we may suppose that $X$ is represented
in a neighborhood $N(x_0)$ of a given $x_0 \in X$ by the euclidean coordinates
$(x^1, x^2, \dots, x^n)$. It follows from Assumption D that $D(A)$ contains a function
$f_0(x)$ which is identically one in some neighborhood, say $N_1(x_0)$, of $x_0$. Hence
making use of the representation (2.2) and the property (2.4), we see that
there exists an $N_2(x_0)$ and a $\delta_1 > 0$ such that

$$\eta^{-1}[U_\eta f_0 - f_0](x) = \eta^{-1}[\theta(x, \eta) f_0(x\eta) - f_0(x)] = \eta^{-1}[\theta(x, \eta) - 1]$$

for all $x \in N_2(x_0)$ and $|\eta| < \delta_1$. Since $f_0 \in D(A)$, the incremental ratio
$\eta^{-1}[U_\eta f_0 - f_0]$ converges in norm as $\eta \to 0$ and therefore

$$\frac{\partial}{\partial \eta}\,\theta(x, \eta)\Big|_{\eta = 0} = \beta(x)$$

exists uniformly in $N_2(x_0)$. It follows that $\beta(x)$ is continuous in $x$; it is obvious
that $\beta(x)$ does not depend on the local coordinate system.

The domain of $A$ also contains functions $f_i(x) = x^i - x_0^i$ $(i = 1, 2, \dots, n)$
in some neighborhood $N_3(x_0)$. Again by (2.4) there exists an $N_4(x_0)$ and a
$\delta_2 > 0$ such that

$$\eta^{-1}[U_\eta f_i - f_i](x) = \eta^{-1}[\theta(x, \eta) f_i(x\eta) - f_i(x)]$$
$$= \eta^{-1}[\theta(x, \eta) - 1][(x\eta)^i - x_0^i] + \eta^{-1}[(x\eta)^i - x^i]$$

for all $x \in N_4(x_0)$ and $|\eta| < \delta_2$. As before the limit exists uniformly in $x$ as
$\eta \to 0$ and since the first term in the right member converges to a limit we see
that

$$\frac{\partial}{\partial \eta}\,(x\eta)^i\Big|_{\eta = 0} = \alpha^i(x)$$

exists. The limit being uniform with respect to $x$ in $N_4(x_0)$, it follows that
$\alpha^i(x)$ is continuous in $N_4(x_0)$.

Finally suppose $f \in D(A) \cap C^{(1)}$. Then writing

$$\eta^{-1}[U_\eta f - f](x) = \eta^{-1}[\theta(x, \eta) - 1] f(x\eta) + \eta^{-1}[f(x\eta) - f(x)]$$

and passing to the limit as $\eta \to 0$, we obtain

$$[Af](x) = \beta(x) f(x) + \sum_{i=1}^{n} \alpha^i(x)\,\frac{\partial f(x)}{\partial x^i}$$

for all $x \in N_2(x_0) \cap N_4(x_0)$. We see from this expression that the $\alpha^i(x)$ are
the components of a contravariant vector field in the above local coordinate
system. This completes the proof.


We note in particular that the infinitesimal operator of $[U_t]$ cannot be a
second order differential operator. As a consequence the solution to a diffusion
equation can never define a strongly continuous group of linear bounded
positive operators on $C_0(X)$.
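
The first-order form of the infinitesimal operator can be seen numerically on the real line. The sketch below is an illustration only: it assumes the flow $xt = x + t$ on $X = \mathbb{R}$ and the sample multiplier $\theta(x, t) = p(x)/p(x + t)$ with $p(x) = 2 + \sin x$, neither of which is taken from the paper, and all function names are hypothetical. For this choice the vector field is $\alpha(x) = 1$ and the scalar is $\beta(x) = -p'(x)/p(x)$, so the incremental ratio in (3.1) should approach $f'(x) + \beta(x) f(x)$.

```python
import math


def p(x):
    """A sample element of P(R): positive, continuous, bounded away from 0 and infinity."""
    return 2.0 + math.sin(x)


def f(x):
    """A smooth test function vanishing at infinity."""
    return math.exp(-x * x)


def f_prime(x):
    """Exact derivative of f."""
    return -2.0 * x * math.exp(-x * x)


def U(t, g):
    """(U_t g)(x) = theta(x, t) * g(x + t) with theta(x, t) = p(x) / p(x + t)."""
    return lambda x: p(x) / p(x + t) * g(x + t)


def difference_quotient(x, eta=1e-6):
    """The incremental ratio eta^{-1} [U_eta f - f](x) from (3.1)."""
    return (U(eta, f)(x) - f(x)) / eta


def predicted_generator(x):
    """Right side of (3.4) with alpha(x) = 1 and beta(x) = -p'(x) / p(x)."""
    beta = -math.cos(x) / p(x)
    return f_prime(x) + beta * f(x)
```

Comparing `difference_quotient(x)` with `predicted_generator(x)` at a few points shows agreement to within the finite-difference error, with no second-order term appearing.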


4. Introducing the cohomology group $H^1(G, P)$. Consider two
representations $[U_\sigma^{(1)}]$ and $[U_\sigma^{(2)}]$ of a group $G$ on $L(X)$ by bounded positive
operators. Write $P(X)$ (or simply $P$) for the class of all positive continuous
functions on $X$ which are bounded away from 0 and infinity. We call these
representations $[U_\sigma^{(i)}]$ P-equivalent if, for some $p$ in $P$,

(4.1) $L_p U_\sigma^{(1)} L_p^{-1} = U_\sigma^{(2)}$.

Suppose (4.1) holds, and let $\theta^{(i)}(\cdot, \sigma)$ and $T_\sigma^{(i)}$ denote the corresponding
multiplication and flow factors of the $U_\sigma^{(i)}$, in the sense of (2.2). The
uniqueness of factorization shows that these constituents are related by

(4.2) $T_\sigma^{(1)} = T_\sigma^{(2)}$,

and

(4.3) $\theta^{(1)}(\cdot, \sigma)\,[p(\cdot)/p(\cdot\sigma)] = \theta^{(2)}(\cdot, \sigma)$.

In particular, if $\theta^{(1)} = 1$, so in other words $U_\sigma^{(1)}$ is a pure flow, and if we write
simply $U_\sigma$ for $U_\sigma^{(2)}$, then (4.3) becomes

(4.4) $\theta(\cdot, \sigma) = p(\cdot)/p(\cdot\sigma)$.


In other words, a necessary and sufficient condition that a representation
of $G$ on $L(X)$ by bounded positive operators be P-equivalent to a pure flow
is that its multiplication factor $\theta(\cdot, \sigma)$ have the form (4.4), for some function
$p$ in $P$. (In this connection, one should remark that the notion of P-equivalence
is not so restrictive as might first appear; see Corollary 6.1 below.)

We see then that significant properties of positive representations of $G$
on $L(X)$ are connected with functional identities (for example (2.3) and (4.4))
involving their multiplication factors. On the other hand, as the informed
reader will note, these identities are cohomology statements in the sense of
the Eilenberg-MacLane cohomology theory (2, p. 55). While our work here
has a very limited contact with this theory (in that we study only $H^1(G, P)$,
the first cohomology group of $G$ with coefficients in $P$), it is none the less
advantageous to adopt a few of these notions for our purposes. These we
review in the following.


Definition 4.1. Let $G$ be a group, $X$ a locally compact Hausdorff space,
and $P$ the multiplicative abelian group of all positive continuous functions
on $X$ which are bounded away from 0 and infinity. Assume that $G$ acts on $X$.
When $G$ is a topological group, we say this action is continuous if the mapping
$(x, \sigma) \to x\sigma$ is jointly continuous. By a cochain (more precisely, a 1-cochain)
we mean any function $\theta(\cdot, \sigma)$ on $G$ to $P(X)$. If $G$ is topological and acts
continuously, we call a cochain continuous if it is jointly continuous on $X \times G$.
A cocycle is a cochain satisfying the identity (2.3), viz.

$$\theta(x, \sigma\tau) = \theta(x\sigma, \tau)\,\theta(x, \sigma).$$

The multiplicative abelian group of cocycles is denoted $Z^1(G, P)$. A
coboundary is a cochain $\theta(\cdot, \sigma)$ having the form $\theta(x, \sigma) = p(x)/p(x\sigma)$, for some
$p$ in $P$. $B^1(G, P)$ will denote their group (clearly, a subgroup of $Z^1(G, P)$).
By the (first) cohomology group $H^1(G, P)$ (of $G$ with values in $P$) we mean
the quotient group $Z^1(G, P)/B^1(G, P)$.

Returning to the study of positive representations of $G$ on $L(X)$, let us
say that a positive representation $[U_\sigma]$ of $G$ on $L(X)$ belongs to the given flow
$[T_\sigma]$ of $G$ on $L(X)$ if there exists a cocycle $\theta(\cdot, \sigma)$ in $Z^1(G, P)$ so that
$U_\sigma = L_{\theta(\cdot,\sigma)} T_\sigma$. By (4.3), any representation of $G$ on $L(X)$ P-equivalent to $[U_\sigma]$
also belongs to the flow $[T_\sigma]$ and has for its multiplication factor a cocycle
cohomologous with $\theta(\cdot, \sigma)$. These remarks in conjunction with Theorem 2.1
give

LEMMA 4.1. There is a natural 1:1 correspondence between P-equivalence
classes of representations of $G$ on $L(X)$ by bounded positive operators belonging
to a given flow and elements of the cohomology group $H^1(G, P)$ taken relative
to the same flow. Under the correspondence, representations equivalent to the flow
correspond to the identity in $H^1(G, P)$.


The base space $C_0(X)$ could have been used rather than $L(X)$ in the above
discussion. For $G$ topological [resp. locally compact], an obvious variant of
the discussion applies to strongly continuous representations on $L(X)$ [resp.
$C_0(X)$] and continuous cocycles. Finally we note for $G$ merely topological,
that the cobounding continuous cocycles are necessarily bounded and hence
are in 1:1 correspondence with the strongly continuous representations on
$C_0(X)$ which are P-equivalent to the flow.


Example 4.1. Suppose that $G$ is compact, and that a continuous action of
$G$ on $X$ is given. Then any continuous cocycle $\theta(\cdot, \sigma)$ in $Z^1(G, P)$ is trivial
(viz. a coboundary). In particular, therefore, any strongly continuous
representation of a compact group on $L(X)$ by bounded positive operators is
equivalent to a flow.

In fact, given the continuous cocycle $\theta(\cdot, \sigma)$, define

$$p(x) = \int_G \theta(x, \sigma)\,d\sigma,$$

where $d\sigma$ is an element of Haar measure of $G$. Trivially, $p$ lies in $P(X)$.
Further,

$$p(x\tau) = \int_G \theta(x\tau, \sigma)\,d\sigma = [\theta(x, \tau)]^{-1} \int_G \theta(x, \tau\sigma)\,d\sigma = [\theta(x, \tau)]^{-1} p(x).$$

This proves that $\theta$ is a coboundary, and the other assertions follow
automatically.
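
When $G$ is finite, the Haar integral in Example 4.1 is just an average over the group, and the computation can be replayed exactly. The sketch below is an illustration with assumed data: $G = \mathbb{Z}/3\mathbb{Z}$ acts on a six-point space with two orbits, the cocycle is generated from hypothetical sample values `q`, and the averaging step itself only ever looks at `theta`. The names `act`, `theta`, `averaged_p`, and `is_coboundary_of_p` are not from the paper.

```python
from fractions import Fraction

X = range(6)
G = range(3)  # the cyclic group Z/3Z, written additively


def act(x, s):
    """Right action of Z/3Z on X = {0,...,5}: x*s = (x + 2*s) mod 6 (two orbits)."""
    return (x + 2 * s) % 6


# Hypothetical sample values used only to manufacture a cocycle.
q = [Fraction(1), Fraction(2), Fraction(5), Fraction(3), Fraction(4), Fraction(7)]


def theta(x, s):
    """A cocycle on G with values in the positive functions on X."""
    return q[x] / q[act(x, s)]


def averaged_p(x):
    """p(x) = (1/|G|) * sum over sigma of theta(x, sigma): the Haar integral for finite G."""
    return sum(theta(x, s) for s in G) / len(G)


def is_coboundary_of_p():
    """Check theta(x, tau) = p(x) / p(x*tau) for all x and tau."""
    return all(
        theta(x, t) == averaged_p(x) / averaged_p(act(x, t))
        for x in X for t in G
    )
```

The averaged function need not coincide with `q`: the two can differ by a factor that is constant on each orbit, which does not affect the quotient $p(x)/p(x\tau)$.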


5. On bounded representations. By a bounded positive representation
of $G$ on $L(X)$ (or $C_0(X)$) we mean a representation by uniformly bounded
positive operators. If $[U_\sigma]$ is such a representation, and if $\theta(\cdot, \sigma)$ is its cocycle,
then the relation $\|U_\sigma\| = \|\theta(\cdot, \sigma)\|$ (from Lemma 2.1) shows that $\theta(\cdot, \sigma) \le M$,
for all $\sigma$ and some constant $M$. This and the identity $\theta(\cdot, \sigma)^{-1} = \theta(\cdot\sigma, \sigma^{-1})$
show in turn that

(5.1) $M^{-1} \le \theta(\cdot, \sigma) \le M$, for all $\sigma$.

We shall call a cocycle $\theta(\cdot, \sigma)$ bounded if it satisfies a relation (5.1). Our
argument shows therefore that a positive representation of $G$ on $L(X)$ is
bounded if and only if its cocycle is bounded.

We shall deal in this section with the problem of determining when bounded
positive representations are equivalent to pure flow representations. This
comes in other words to determining conditions (on $X$, or on the flow) under
which bounded cocycles are coboundaries. The following lemma (with $\mathfrak{M}$ = the
class of all positive functions on $X$ bounded away from 0 and infinity) shows that
bounded cocycles do indeed cobound when $X$ is discrete.

























LEMMA 5.1. Let $\theta(\cdot, \sigma)$ be a bounded cocycle in $Z^1(G, P)$. Let $\mathfrak{M}$ be a class of
positive functions on $X$, bounded away from 0 and infinity, which contains the
functions $\theta(\cdot, \sigma)$ and contains, along with $f$, the function $\theta(\cdot, \sigma) f(\cdot\sigma)$. Then,
if $h(\cdot) = \mathrm{GLB}_\sigma\,\theta(\cdot, \sigma)$ exists relative to $\mathfrak{M}$, we have $\theta(\cdot, \sigma) = h(\cdot)/h(\cdot\sigma)$.

Proof. We are assuming here that $h$ lies in $\mathfrak{M}$, that $h(x) \le \theta(x, \sigma)$, for all
$x$ and $\sigma$, and that any other $f$ in $\mathfrak{M}$ with this property must satisfy $f \le h$.
Fix on $\tau$ in $G$. Then for all $x$ and $\sigma$,

$$h(x\tau) \le \theta(x\tau, \sigma) = \theta(x, \tau\sigma)/\theta(x, \tau),$$

or $\theta(x, \tau)\,h(x\tau) \le \theta(x, \tau\sigma)$. Our assumptions about $\mathfrak{M}$ and $h$ imply now that
$\theta(x, \tau)\,h(x\tau) \le h(x)$. Substituting $x\tau$ for $x$ and $\tau^{-1}$ for $\tau$, and making use of the
relation $\theta(x\tau, \tau^{-1}) = [\theta(x, \tau)]^{-1}$, gives the opposite inequality. Therefore,
$\theta(x, \tau)\,h(x\tau) = h(x)$, as asserted.
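
On a finite discrete space the greatest lower bound in Lemma 5.1 is a plain minimum, so the conclusion can be verified directly. The following sketch is a toy check under assumed data, not a construction from the paper: the symmetric group $S_3$ acts on three points by $x\sigma = \sigma[x]$, the cocycle is built from hypothetical sample values `q`, and the names `theta`, `h`, and `lemma_5_1_holds` are introduced here for illustration.

```python
from fractions import Fraction
from itertools import permutations

# S_3 acting on X = {0, 1, 2} on the right: x*sigma = sigma[x].
G = list(permutations(range(3)))

# Hypothetical sample values used only to manufacture a bounded cocycle.
q = [Fraction(2), Fraction(3), Fraction(7)]


def theta(x, sigma):
    """A bounded cocycle: theta(x, sigma) = q(x) / q(x*sigma)."""
    return q[x] / q[sigma[x]]


def h(x):
    """Greatest lower bound of theta(x, sigma) over sigma; a minimum since G is finite."""
    return min(theta(x, sigma) for sigma in G)


def lemma_5_1_holds():
    """Check theta(x, sigma) = h(x) / h(x*sigma) for all x and sigma."""
    return all(
        theta(x, sigma) == h(x) / h(sigma[x])
        for x in range(3) for sigma in G
    )
```

In this example `h` is `q` divided by its largest value on the orbit, which shows how the lower bound normalizes the cobounding function.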
When $X$ is a Stone space, that is, a locally compact Hausdorff space for
which $C(X)$ is a conditionally complete lattice (see (11)), then we apply
the Lemma with $\mathfrak{M} = P(X)$ to obtain

COROLLARY 5.1. If $X$ is a Stone space, then each strongly continuous bounded
positive representation of a topological group $G$ on $L(X)$ (or $C_0(X)$) is
P-equivalent to a flow representation of $G$ on $L(X)$ (resp. $C_0(X)$).


We call a given action $(x, \sigma) \to x\sigma$ of a group $G$ on $X$ ergodic if each orbit
$xG = [x\sigma \mid \sigma \in G]$ is dense in $X$. As we now show under general conditions,
this restriction on the flow suffices to eliminate non-trivial bounded cocycles.

THEOREM 5.1. Let $G$ be a group which acts ergodically on the Hausdorff
space $X$. Then each bounded cocycle in $Z^1(G, P)$ is a coboundary.


Proof. Fix on a bounded cocycle $\theta_1(\cdot, \sigma)$ in $Z^1(G, P)$. In order to simplify
notation in the proof, we shall deal with $\theta(x, \sigma) = \log \theta_1(x, \sigma)$ rather than
with $\theta_1$, so our conditions on $\theta$ are

(5.2) $-M \le \theta(x, \sigma) \le M$ and $\theta(x, \sigma\tau) = \theta(x\sigma, \tau) + \theta(x, \sigma)$,

for all $x$, $\sigma$, $\tau$. Our task is to exhibit a bounded continuous function $h(\cdot)$ on $X$
satisfying

(5.3) $\theta(x, \sigma) = h(x\sigma) - h(x)$,

for then $p(x) = \exp(-h(x))$ will give $p(x)/p(x\sigma) = \theta_1(x, \sigma)$ and $p \in P$.

To begin with we assume that $X$ is a single orbit, $X = x_0G$, and establish
the Theorem in this special case.

By Lemma 5.1, there exists a bounded (but not a priori continuous) function
$h(\cdot)$ on $X$ so that (5.3) holds. Since (5.3) will also hold when $h(\cdot)$ is replaced
by $h(\cdot) + c$ ($c$ constant), we can assume that $h(x_0) = 0$. Therefore,

(5.4) if $x = x_0\sigma$, then $h(x) = \theta(x_0, \sigma)$.

(Note from this that $x_0\sigma = x_0\tau$ will entail $\theta(x_0, \sigma) = \theta(x_0, \tau)$.) We shall prove
that this function $h$ is necessarily continuous.

Grant that we have proved continuity of $h$ at $x_0$, namely,

(5.5) $\lim x_\alpha = x_0$ entails $\lim h(x_\alpha) = 0$.

We show that $h$ is then everywhere continuous. For suppose $\lim x_\alpha = y = x_0\sigma$.
By (5.5), $\lim h(x_\alpha\sigma^{-1}) = 0$, so we have from (5.3) that

$$\lim h(x_\alpha) = \lim h(x_\alpha\sigma^{-1}) - \lim \theta(x_\alpha, \sigma^{-1})$$
$$= -\theta(y, \sigma^{-1}) = -\theta(x_0\sigma, \sigma^{-1}) = \theta(x_0, \sigma) = h(y).$$

We now prove (5.5). As the basis of an indirect proof, we can assume
(replacing $h$, $\theta$ by $-h$, $-\theta$ if necessary) that

$$\limsup_{y \to x_0} h(y) > \varepsilon > 0.$$

Each neighborhood of $x_0$ will then contain a point $y = x_0\sigma$ for which
$\theta(x_0, \sigma) > \varepsilon$. Choose any $\sigma = \sigma_1$ for which $\theta(x_0, \sigma_1) > \varepsilon$. Assume elements
$\sigma_1, \dots, \sigma_n$ of $G$ have been chosen so that

(5.6) $\theta(x_0, \sigma_n \cdots \sigma_1) > (n - 1)\varepsilon - \left[\frac{\varepsilon}{2} + \dots + \frac{\varepsilon}{2^{n-1}}\right]$.

Choose a neighborhood $N$ of $x_0$ so that $y$ in $N$ gives

$$\theta(y, \sigma_n \cdots \sigma_1) > \theta(x_0, \sigma_n \cdots \sigma_1) - \frac{\varepsilon}{2^n},$$

and then choose $\sigma_{n+1}$ so that $x_0\sigma_{n+1}$ lies in $N$ and $\theta(x_0, \sigma_{n+1}) > \varepsilon$. We then have

$$\theta(x_0, \sigma_{n+1} \cdots \sigma_1) = \theta(x_0\sigma_{n+1}, \sigma_n \cdots \sigma_1) + \theta(x_0, \sigma_{n+1}) > n\varepsilon - \left[\frac{\varepsilon}{2} + \dots + \frac{\varepsilon}{2^n}\right],$$

so $\sigma_n$ is defined for all $n$ and (5.6) can be realized. This inequality shows that

(5.7) $\theta(x_0, \sigma_n \cdots \sigma_1) > (n - 2)\varepsilon$, for all $n$.

But this contradicts the boundedness of $\theta$. Therefore the Theorem is proved
in the single orbit case.

We turn next to the general case. Choose any orbit $\Pi = x_0G$ and any
point $x$ in $X$. By what we have proved, there is a bounded function $h$, defined
and continuous on $\Pi$, and satisfying

(5.8) $\theta(y, \sigma) = h(y\sigma) - h(y)$ ($y$ in $\Pi$, $\sigma$ in $G$).

Accordingly, for any subset $S$ of $X$,

(5.9) $\mathrm{Var}_{S \cap \Pi}\,h(\cdot) \le \mathrm{Var}_{S \cap \Pi}\,h(\cdot\sigma) + \mathrm{Var}_S\,\theta(\cdot, \sigma)$.

By the continuity of $h$ on $\Pi$, given $\varepsilon > 0$, there exists a neighborhood $U$ of $x_0$
so that

$$\mathrm{Var}_{U \cap \Pi}\,h(\cdot) < \tfrac{1}{2}\varepsilon.$$

Since $xG$ is dense, by assumption, we can choose $\sigma$ in $G$ so that $x\sigma \in U$, and
in turn, we can choose a neighborhood $V$ of $x$ so that $V\sigma \subset U$. Shrinking $V$
if necessary, we can assume that $\mathrm{Var}_V\,\theta(\cdot, \sigma) < \tfrac{1}{2}\varepsilon$, because $\theta(\cdot, \sigma)$ is
continuous on $X$. With $S = V$, (5.9) then yields

(5.10) $\mathrm{Var}_{V \cap \Pi}\,h(\cdot) < \varepsilon$.

Therefore, if $x_\alpha$ is a sequence in $\Pi$ with $\lim x_\alpha = x$, then $\lim h(x_\alpha)$ will exist
and depend only on $x$. This shows that $h(\cdot)$ extends to a function $h(\cdot)$ on the
closure $X$ of $\Pi$ which in particular satisfies

(5.11) $\mathrm{Var}_V\,h(\cdot) \le \varepsilon$.

Since $x$ is arbitrary, it follows that $h$ is continuous on all $X$. In particular,
continuity shows that (5.8) holds for this $h$ and all $y$ in $X$. This proves Theorem
5.1.

A fortiori, if G is a topological group acting continuously and ergodically 
on X, then any bounded cocycle is automatically continuous. We summarize 
the implications of this Theorem as they apply to bounded positive repre- 
sentations. 


COROLLARY 5.2. Let $\sigma \to U_\sigma$ be a strongly continuous bounded positive
representation of the topological group $G$ on $C_0(X)$. Assume that no non-trivial
closed ideals of $C_0(X)$ are invariant under this representation. Then $[U_\sigma]$ is
P-equivalent to a strongly continuous flow representation $\sigma \to T_\sigma$ of $G$ on $C_0(X)$.

Proof. Let $[T_\sigma]$ be the flow associated with $[U_\sigma]$ (as in Theorem 2.1).
We know that $\sigma \to T_\sigma$ is strongly continuous. That the action of $G$ on $X$
implemented by $T_\sigma$ is ergodic follows from the well-known characterization of
closed ideals in $C_0(X)$ (viz. as the class of all functions in $C_0(X)$ vanishing on
an arbitrary closed set). The bounded cocycle associated with $[U_\sigma]$ is therefore
a coboundary, by the Theorem, and the conclusion follows from the remark
following Lemma 4.1.


Example 5.1. We conclude this discussion by showing that bounded
cocycles do not in general cobound.

For $X$ take the two-point compactification $[-\infty, +\infty]$ of the reals, and
for $G$ the additive group of real numbers with the usual topology. Define the
action (continuous) of $G$ on $X$ by setting $xt = x + t$ for $x$ finite and $xt = x$
for $x$ infinite. Next, define

$$p(x) = \begin{cases} 2 & \text{for } x = -\infty, \\ 2 + \sin |x|^{1/2} & \text{for } x \text{ finite}, \\ 2 & \text{for } x = +\infty, \end{cases}$$

and set $\theta(x, t) = p(x)/p(xt)$. A straightforward computation shows that
$\theta(x, t)$ is a bounded (continuous) cocycle in $Z^1(G, P)$. However, $p(\cdot)$ does not
belong to $P(X)$ since it is not continuous on the closed interval $[-\infty, \infty]$.
Now, it is easy to see that any positive function $q$ defined on $(-\infty, \infty)$
with the property $\theta(x, t) = q(x)/q(xt)$ must be a positive constant multiple
of $p$; no such multiple can have a continuous extension on $[-\infty, \infty]$. It
follows that $\theta(\cdot, t)$ cannot be a coboundary.
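
The two features that make Example 5.1 work can be observed numerically: the quotient $p(x)/p(x + t)$ tends to 1 as $|x|$ grows, while $p$ itself keeps oscillating and so has no limit at infinity. The sketch below is only a numerical illustration for finite $x$. It takes the exponent in $p$ to be $1/2$ (an assumption of this sketch, written here as a square root), and the helper names `peak` and `trough` are hypothetical.

```python
import math


def p(x):
    """p(x) = 2 + sin(|x|^(1/2)) for finite x; the exponent 1/2 is assumed."""
    return 2.0 + math.sin(math.sqrt(abs(x)))


def theta(x, t):
    """theta(x, t) = p(x) / p(x + t) for finite x."""
    return p(x) / p(x + t)


def peak(k):
    """A point where p is numerically 3: sqrt(x) = pi/2 + 2*pi*k."""
    return (math.pi / 2 + 2 * math.pi * k) ** 2


def trough(k):
    """A point where p is numerically 1: sqrt(x) = 3*pi/2 + 2*pi*k."""
    return (3 * math.pi / 2 + 2 * math.pi * k) ** 2
```

Evaluating `p` at `peak(k)` and `trough(k)` for large `k` gives values near 3 and 1 respectively, while `theta(x, t)` at the same points stays close to 1 for fixed `t`, which is the behaviour that lets $\theta$ extend continuously to the endpoints even though $p$ does not.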


6. Automorphisms of groups of positive operators. Let $F$ be a group of
homeomorphisms of the locally compact Hausdorff space $X$. In abuse of
notation, we shall also write $F$ for the group of isometries $[T_\sigma]$ of $L(X)$
implemented by the $\sigma$ in $F$. Denote by $M$ the group of all multiplication operators
$[L_p;\ p \in P]$ on $L(X)$. Finally, denote by $\mathfrak{G}$ the group of all regular positive
operators with positive inverses on $L(X)$ whose canonical factorization
$L_pT_\sigma$ has $T_\sigma \in F$. We may describe $\mathfrak{G}$ as the group of all positive operators
on $L(X)$ belonging to the given flow $F$. Group-theoretically, $\mathfrak{G}$ is the
semi-direct product $FM$ of the subgroup $F$ and the (normal abelian) subgroup $M$.
This section concerns a study of automorphisms of the group $\mathfrak{G}$, and our results
here will serve to clarify the significance of some of the algebraic formalisms
we have adopted.


LEMMA 6.1. $M$ is a normal maximal abelian subgroup of $\mathfrak{G}$, and any other
normal abelian subgroup of $\mathfrak{G}$ already lies in $M$.


Proof. If we write $p^\sigma$ for the function $p^\sigma(x) = p(x\sigma)$, then the relation

    $T_\sigma L_p T_\sigma^{-1} = L_{p^\sigma}$

shows that $M$ is normal. Suppose that $U = L_pT_\sigma$ is any element of $\mathfrak{G}$
commuting with all elements of $M$. We have

    $L_q(L_pT_\sigma) = L_pT_\sigma L_q = L_{pq^\sigma}T_\sigma$,

or $qp = q^\sigma p$, or $q = q^\sigma$, for all $q$ in $P$. This clearly entails $\sigma = e$ (the identity),
and it follows that $U = L_p$ lies in $M$, proving that $M$ is maximal abelian.

Suppose that $U = L_qT_\sigma$ is an element of some normal abelian subgroup
$N$ of $\mathfrak{G}$. For each $p$ in $P$, $N$ will then contain

    $L_p(L_qT_\sigma)L_{p^{-1}} = L_{(pq/p^\sigma)}T_\sigma$,

and this element of $N$ will in turn commute with $U$. The commutation relation

    $(L_{(pq/p^\sigma)}T_\sigma)(L_qT_\sigma) = (L_qT_\sigma)(L_{(pq/p^\sigma)}T_\sigma)$

gives

    $pqq^\sigma/p^\sigma = qq^\sigma p^\sigma/p^{\sigma^2}$, and hence $pp^{\sigma^2} = (p^\sigma)^2$

for all $p$. Again, it is easy to conclude that $\sigma = e$, so that $U$ lies in $M$. This
proves the Lemma.
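The two operator identities used in this proof can be checked on a finite model. The sketch below is an illustration only: it assumes $X$ is a finite set $\{0, \ldots, n-1\}$, a homeomorphism is a permutation with $x\sigma$ stored as `sigma[x]`, and functions are lists of rationals. All helper names are hypothetical.

```python
from fractions import Fraction

def T(sigma):
    """Flow operator: (T_sigma f)(x) = f(x sigma), with x sigma = sigma[x]."""
    return lambda f: [f[sigma[x]] for x in range(len(f))]

def L(p):
    """Multiplication operator: (L_p f)(x) = p(x) f(x)."""
    return lambda f: [p[x] * f[x] for x in range(len(f))]

def inverse(sigma):
    """Inverse permutation, so that T(inverse(sigma)) inverts T(sigma)."""
    inv = [0] * len(sigma)
    for x, y in enumerate(sigma):
        inv[y] = x
    return inv

def compose(*ops):
    """compose(A, B, C)(f) = A(B(C(f)))."""
    def run(f):
        for op in reversed(ops):
            f = op(f)
        return f
    return run

def same_operator(A, B, n):
    """Two linear operators agree if they agree on the standard basis."""
    basis = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    return all(A(e) == B(e) for e in basis)

def translate(p, sigma):
    """p^sigma(x) = p(x sigma)."""
    return [p[sigma[x]] for x in range(len(p))]
```

With `sigma = [1, 2, 0]`, the operator `compose(T(sigma), L(p), T(inverse(sigma)))` agrees with `L(translate(p, sigma))`, and `compose(L(p), T(sigma))` commutes with every `L(q)` only when `q` is constant on the orbits of `sigma`.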


We call an automorphism $\varphi$ of $\mathfrak{G}$ bounded if, for some constant $K$ and all
$U$ in $\mathfrak{G}$,

    $K^{-1}\|U\| \le \|\varphi(U)\| \le K\|U\|$.

The inverse of a bounded automorphism is automatically bounded, and we
may speak therefore of the group $\mathrm{Aut}_{bd}(\mathfrak{G})$ of bounded automorphisms of $\mathfrak{G}$.

LEMMA 6.2. Assume $X$ compact. Then each bounded automorphism $\varphi$ of $\mathfrak{G}$
carries $M$ onto itself, and on $M$ has the form

(6.1)    $\varphi(L_p) = L_{p^\tau}$,

where $\tau$ is a homeomorphism of $X$.


Proof. The characterization of Lemma 6.1 shows that any automorphism
of $\mathfrak{G}$ will carry $M$ onto itself; we write $\varphi(L_q) = L_{\varphi(q)}$. We shall deduce (6.1)
from the corollary to Lemma 2.1 which asserts that any linear order isomorphism
$\Psi$ of $L(X)$ which conserves the identity ($X$ compact) is implemented by a
homeomorphism of $X$. For this, define

(6.2)    $\Psi(f) = \log \varphi(\exp f)$,  $f \in L(X)$.

Then $\Psi^{-1}(f) = \log \varphi^{-1}(\exp f)$, and it is clear that $\Psi$ is an additive isomorphism
of $L(X)$ on itself. We next show that $\Psi$ preserves order. For any $x$ in $X$, $q$
in $P(X)$, and $n \ge 1$, we have

    $[\varphi(q)(x)]^n = \varphi(q^n)(x) \le K\|q^n\| = K\|q\|^n$

and therefore, taking $n$th roots and letting $n \to \infty$,

(6.3)    $\varphi(q)(x) \le \|q\|$.

In particular, if $f \ge 0$, then $q = \exp(-f) \le 1$ and $\varphi(q) \le 1$. Thus

    $\varphi(\exp f) = \varphi(q^{-1}) = 1/\varphi(q) \ge 1$

and hence $\Psi(f) = \log \varphi(q^{-1}) \ge 0$. By the same token, $\Psi^{-1}$ must preserve order,
and it follows that $\Psi$ is an order isomorphism and therefore linear. Substitution
of a positive scalar $c$ for $q$ in (6.3) and the corresponding statement for $c^{-1}$
shows that $\varphi(c) = c$. From this it follows that $\Psi(1) = 1$ so that $\Psi$ conserves
the identity. This yields (6.1).


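LEMMA 6.3. Let $\varphi$ be any automorphism of $\mathfrak{G}$ with the property that its
restriction to $M$ has the form

    $\varphi(L_p) = L_{p^\tau}$,

for some homeomorphism $\tau$ of $X$. Then $\tau$ lies in the normalizer $N(F)$ of the flow
$F$, and for all $L_pT_\sigma$ in $\mathfrak{G}$,

(6.4)    $\varphi(L_pT_\sigma) = T_\tau(L_{\theta(\cdot,\sigma)}L_pT_\sigma)T_\tau^{-1}$,

for some cocycle $\theta(\cdot, \sigma)$ in $Z^1(F, P)$.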
Proof. Define $\varphi'(U) = T_\tau^{-1}\varphi(U)T_\tau$. $\varphi'$ maps $\mathfrak{G}$ isomorphically on another
group of positive operators on $L(X)$, and $\varphi'(L_p) = L_p$, for all $p$ in $P$. Therefore

    $L_p\varphi'(T_\sigma) = \varphi'(L_pT_\sigma) = \varphi'(T_\sigma)(T_\sigma^{-1}L_pT_\sigma)$,

showing that the positive operator $\varphi'(T_\sigma)T_\sigma^{-1}$ commutes with all $L_p$. Lemma
6.1 applied to the group of all positive operators on $L(X)$ then shows that
$\varphi'(T_\sigma)T_\sigma^{-1}$ lies in $M$, for each $\sigma \in F$. We write

    $\varphi'(T_\sigma) = L_{\theta(\cdot,\sigma)}T_\sigma$.

A simple calculation shows that $\theta$ lies in $Z^1(F, P)$. Moreover,

    $\varphi(T_\sigma) = L_{\theta(\cdot\tau,\sigma)}T_{\tau\sigma\tau^{-1}}$.

This operator lies in $\mathfrak{G}$, and therefore $\tau F\tau^{-1} \subset F$. The same argument for
$\varphi^{-1}$, $\tau^{-1}$ in place of $\varphi$, $\tau$ gives inclusion in the other direction, and we find that
$F = \tau F\tau^{-1}$, or by definition, $\tau$ lies in $N(F)$. This proves the lemma.


We can now obtain some significant information about the structure of the
automorphism group of $\mathfrak{G}$. One may note the formal similarity of the theorem
to follow to a theorem of I. Singer on the automorphism group of a finite
factor (9, Th. 3.3).


THEOREM 6.1. Suppose $X$ compact. Then the group $\mathrm{Aut}_{bd}(\mathfrak{G})$ is isomorphic
to a semi-direct product $N(F)Z_{bd}$ of groups isomorphic respectively to the
normalizer of the flow $F$ in the group of all homeomorphisms of $X$ and the group of
all bounded cocycles in $Z^1(F, P)$. Here $\tau \in N(F)$ implements the automorphism
$\theta_\tau$ of $Z_{bd}$ given by $\theta_\tau(\theta(\cdot, \sigma)) = \theta(\cdot\tau, \tau^{-1}\sigma\tau)$.


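Proof. We associate with each $\tau$ in $N(F)$ the automorphism $\alpha_\tau$ of $\mathfrak{G}$ defined
by

    $\alpha_\tau(T_\sigma L_p) = T_\tau(T_\sigma L_p)T_\tau^{-1}$.

It is trivial that $\alpha_\tau \ne e$ when $\tau \ne e$, and that

    $\alpha_{\tau_1\tau_2} = \alpha_{\tau_1}\alpha_{\tau_2}$.

It follows that $\tau \to \alpha_\tau$ maps $N(F)$ isomorphically into $\mathrm{Aut}_{bd}(\mathfrak{G})$. Next, associate
with the bounded cocycle $\theta$ in $Z^1(F, P)$ the bounded automorphism $\alpha_\theta$ of $\mathfrak{G}$,

    $\alpha_\theta(T_\sigma L_p) = L_{\theta(\cdot,\sigma)}T_\sigma L_p$.

Again, it follows readily that $\theta \to \alpha_\theta$ is an isomorphism of $Z_{bd}$ into $\mathrm{Aut}_{bd}(\mathfrak{G})$.
Moreover,

    $\alpha_\tau\alpha_\theta\alpha_\tau^{-1}(T_\sigma L_p) = L_{\theta(\cdot\tau,\,\tau^{-1}\sigma\tau)}T_\sigma L_p$,

so that the image of $Z_{bd}$ is normal in $\mathrm{Aut}_{bd}(\mathfrak{G})$. Lemmas 6.2 and 6.3 show that
each bounded automorphism $\varphi$ of $\mathfrak{G}$ has a factorization $\varphi = \alpha_\tau\alpha_\theta$, and it is
readily seen that this factorization is unique. This proves the Theorem.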


Following a similar pattern, we now give an interpretation of the cohomology
group $H^1(F, P)$. For this, we call an automorphism $\varphi$ of $\mathfrak{G}$ flow related if $\varphi$
coincides on $M$ with some automorphism $\alpha_\tau$ ($\tau$ in $F$), in the sense that $\varphi(L_p) = L_{p^\tau}$.


THEOREM 6.2. There exists a natural isomorphism between $H^1(F, P)$ and
the group $\mathrm{Aut}_{fr}(\mathfrak{G})/\mathrm{Inaut}(\mathfrak{G})$ of flow related automorphisms of $\mathfrak{G}$ modulo inner
automorphisms.


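Proof. It follows from Lemma 6.3 that each flow related automorphism
$\varphi$ has a unique factorization $\varphi = \alpha_\tau\alpha_\theta$ ($\tau \in F$, $\theta \in Z^1 = Z^1(F, P)$), so in the
notation of Theorem 6.1, we have

(6.5)    $\mathrm{Aut}_{fr}(\mathfrak{G}) = FZ^1$.

Suppose $\theta$ is an element of $B^1 = B^1(F, P)$, $\theta(\cdot, \sigma) = p(\cdot)/p(\cdot\sigma)$. Then

    $\alpha_\tau\alpha_\theta(T_\sigma L_q) = T_\tau L_p(T_\sigma L_q)L_p^{-1}T_\tau^{-1}$,

so that $\varphi = \alpha_\tau\alpha_\theta$ is inner. The argument here clearly reverses, and we see that

(6.6)    $\mathrm{Inaut}(\mathfrak{G}) = FB^1$.

Suppose $\tau \in F$, $\theta \in Z^1$. Then

(6.7)    $\alpha_\tau^{-1}\alpha_\theta\alpha_\tau\alpha_\theta^{-1} = \alpha_\beta$, for some $\beta \in B^1$.

In fact,

    $(\alpha_\tau^{-1}\alpha_\theta\alpha_\tau\alpha_\theta^{-1})(T_\sigma L_q) = L_{\theta(\cdot\tau^{-1},\,\tau\sigma\tau^{-1})/\theta(\cdot,\sigma)}T_\sigma L_q$,

and

    $\beta(\cdot, \sigma) = \theta(\cdot\tau^{-1}, \tau\sigma\tau^{-1})/\theta(\cdot, \sigma) = \theta(\cdot, \sigma\tau^{-1})\,\theta(\cdot\tau^{-1}, \tau)/\theta(\cdot, \sigma)$
    $\qquad\quad = \theta(\cdot, \sigma\tau^{-1})/[\theta(\cdot, \tau^{-1})\,\theta(\cdot, \sigma)] = \theta(\cdot\sigma, \tau^{-1})/\theta(\cdot, \tau^{-1})$.

If we set $p(\cdot) = [\theta(\cdot, \tau^{-1})]^{-1}$, then it is clear that $\beta(\cdot, \sigma) = p(\cdot)/p(\cdot\sigma) \in B^1$,
proving (6.7).

As a characteristic subgroup, $\mathrm{Inaut}(\mathfrak{G})$ is normal in $\mathrm{Aut}_{fr}(\mathfrak{G})$. Observe
now that the automorphisms

    $\alpha_{\tau_1}\alpha_{\theta_1}$,  $\alpha_{\tau_2}\alpha_{\theta_2}$

lie in the same coset mod $\mathrm{Inaut}(\mathfrak{G})$ if and only if $\theta_1$ is cohomologous with $\theta_2$.
In fact, using (6.7), we have

    $\alpha_{\theta_1}^{-1}\alpha_{\tau_1}^{-1}\alpha_{\tau_2}\alpha_{\theta_2} = \alpha_{\tau_1}^{-1}\alpha_{\tau_2}\,\alpha_\beta\,\alpha_{\theta_1}^{-1}\alpha_{\theta_2}$,  $(\beta \in B^1)$,

and this automorphism is inner if and only if

    $\theta_1^{-1}\theta_2 \in B^1$,

or equivalently, if and only if $\theta_2$ is in the same coset of $Z^1$ mod $B^1$ as $\theta_1$. It
follows that the mapping which carries the coset of the automorphism $\alpha_\tau\alpha_\theta$
on the coset of $\theta$ is an isomorphism onto.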

We conclude our study of automorphisms of $\mathfrak{G}$ by a brief consideration of
automorphisms implemented by bounded operators on $L(X)$. Since any such
operator can be extended to be regular on $C_0(X)$ we may, without loss of
generality, choose the latter as our base space.


LEMMA 6.4. Let $W$ be a bounded regular operator on $C_0(X)$ with the property
that $U \to WUW^{-1}$ defines an automorphism of $\mathfrak{G}$. Then there exists a bounded
positive operator $V$ on $C_0(X)$ with a bounded positive inverse such that
$WUW^{-1} = VUV^{-1}$ for all $U$ in $\mathfrak{G}$.


Proof. Let $\beta(X)$ be the Stone-Čech compactification of $X$ and denote by
$\mathfrak{G}'$ the unique extension of $\mathfrak{G}$ to a group of positive operators on $C(\beta(X))$.
In the obvious way, $W \cdot W^{-1}$ defines a bounded automorphism of $\mathfrak{G}'$. By
Lemma 6.2, therefore, there exists a homeomorphism $\tau$ of $\beta(X)$ such that
$WL_pW^{-1} = L_{p^\tau}$, for all $p$ in $P(\beta(X))$. For $h$ in $L(X)$ and $x$ in $\beta(X)$, we will have

    $W(ph)(x) = p(x\tau)\,W(h)(x)$.

Linearity shows that this must hold for all $p$ in $C(\beta(X))$. We now show that
$\tau(X) \subset X$. Suppose on the contrary that $\tau$ maps some $x$ in $X$ into $\tau(x) \in \beta(X) - X$.
Choose $h$ in $L(X)$ so that $(Wh)(x) \ne 0$, and then choose $p$ in $C(\beta(X))$
so that $p = 1$ on the support of $h$ (which lies in $X$ since it is a compact
subset of $X$) and so that $p(x\tau) = 0$. This gives $W(h)(x) = W(ph)(x) = 0$,
which is impossible. This argument applies as well to $W^{-1}$, $\tau^{-1}$, and we see
therefore that $\tau(X) = X$. Hence if we set $W' = T_{\tau^{-1}}W$, then $W'$ is a regular
bounded operator on $C_0(X)$ which satisfies the relation

(6.8)    $W'L_f = L_fW'$, for all bounded $f$ in $C(X)$.

We shall complete the proof by showing that any bounded regular operator
$W'$ satisfying (6.8) has the form $L_g$, for some $g \in C(X)$ with $|g|$ in $P(X)$.
The operator $V = T_\tau L_{|g|}$ will then be positive and will implement the same
automorphism of $\mathfrak{G}$ as $W \cdot W^{-1}$. We now let $[O_\alpha]$ denote the collection of all
open sets of $X$ with compact closures, and for each $\alpha$ we choose a function
$h_\alpha$ in $L(X)$ which is 1 on $O_\alpha$. Set $g_\alpha = W'h_\alpha$. If $h$ in $L(X)$ vanishes off $O_\alpha$, then
(6.8) gives

(6.9)    $W'(h) = W'(hh_\alpha) = hW'(h_\alpha) = hg_\alpha$.

Therefore, if $x \in O_\alpha \cap O_\beta$, $h(x) = 1$, and if $h$ vanishes off $O_\alpha \cap O_\beta$, then

    $g_\alpha(x) = h(x)g_\alpha(x) = (W'h)(x) = h(x)g_\beta(x) = g_\beta(x)$.

So $g_\alpha = g_\beta$ on $O_\alpha \cap O_\beta$. Define a function $g$ on $X$ by setting $g(x) = g_\alpha(x)$
if $x \in O_\alpha$. This function $g$ is then well defined and continuous, and (6.9) shows
that $W'h = gh$, for any $h$ in $L(X)$. It follows from this that $\|g\| = \|W'\| < \infty$
and that $g$ does not vanish. If we apply this argument to $W'^{-1}$, to obtain a
bounded $k$ such that $W'^{-1}h = kh$ for all $h$ in $L(X)$, then

    $h = W'(W'^{-1}(h)) = hkg$,

so $kg = 1$, and $\|g^{-1}\|$ is finite. Therefore $|g|$ lies in $P$, and the Lemma is proved.


This lemma has application to the representation theory. Our work in
§§4 and 5 was based on the notion of $P$-equivalence of representations. On the
other hand, in the conventional sense, two representations $[U_\sigma]$ and $[V_\sigma]$
of a group $G$ on $C_0(X)$ are equivalent if there exists a bounded regular operator
$W$ on $C_0(X)$ so that $WU_\sigma W^{-1} = V_\sigma$ for all $\sigma \in G$. If we require in addition
that this operator $W$ determine an automorphism of the group of all positive
operators on $C_0(X)$, in the sense of the preceding lemma, then we can just as
well assume to begin with that $W$ is a positive operator. Knowing the form of
positive operators with positive inverses, however, we infer from this


COROLLARY 6.1. Let $\sigma \to U_\sigma$ be a bounded positive representation of the group
$G$ on $C_0(X)$, and suppose there exists a bounded regular operator $W$ on $C_0(X)$
such that $\sigma \to WU_\sigma W^{-1}$ is a flow representation of $G$, and such that $U \to WUW^{-1}$
defines an automorphism of the group of all positive operators on $C_0(X)$. Then
$[U_\sigma]$ is already $P$-equivalent to a flow representation.


7. The adjoint representation. Suppose that $\sigma \to U_\sigma$ is a strongly
continuous representation of a topological group $G$ on $C_0(X)$. Denote by
$C_0(X)^*$ the adjoint space of $C_0(X)$, that is, the space of bounded linear
functionals $\lambda$ on $C_0(X)$ with the norm $\|\lambda\| = \sup_{\|f\| \le 1} |\lambda(f)|$. It is well known
that elements of $C_0(X)^*$ can be represented as integrals on $C_0(X)$ relative to
signed Borel measures on $X$ of finite total variation (4, chap. X). Associated
with the representation $[U_\sigma]$ is an anti-representation $\sigma \to U^*_\sigma$ of $G$ on $C_0(X)^*$
defined by $(U^*_\sigma\lambda)(f) = \lambda(U_\sigma f)$. This anti-representation is in itself not a
natural object to study, since it will in general fail to be strongly continuous.
However, study of the forward diffusion equation in semi-group theory has
suggested a natural refinement of these notions (see (3) and (7)).


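Definition 7.1. By the adjoint representation $[U^{\odot}_\sigma, D(U^{\odot}_\sigma)]$ to a given strongly
continuous representation $\sigma \to U_\sigma$ of $G$ on $C_0(X)$, we mean the pair consisting
of the representation $\sigma \to U^{\odot}_\sigma$ of $G$ on $C_0(X)^*$ defined by

    $(U^{\odot}_\sigma\lambda)(f) = \lambda(U_{\sigma^{-1}}f)$,

together with a subspace $D(U^{\odot}_\sigma)$ of $C_0(X)^*$, called the domain of the adjoint
representation, and consisting of all $\lambda$ in $C_0(X)^*$ for which the mapping
$\sigma \to U^{\odot}_\sigma\lambda$ is strongly continuous.

Two adjoint representations $[U^{\odot}_\sigma, D(U^{\odot}_\sigma)]$ and $[V^{\odot}_\sigma, D(V^{\odot}_\sigma)]$ of $G$ will be
called equivalent if $D(U^{\odot}_\sigma) = D(V^{\odot}_\sigma)$, and if there exists a bounded regular
operator $W$ on $C_0(X)^*$ such that $W^{-1}U^{\odot}_\sigma W = V^{\odot}_\sigma$ for all $\sigma \in G$. In this case
$W[D(U^{\odot}_\sigma)] = D(U^{\odot}_\sigma)$.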

To bring this notion of domain into clearer focus, we note that each operator
$U^{\odot}_\sigma$ maps $D(U^{\odot}_\sigma)$ into itself so that the restriction of $U^{\odot}_\sigma$ to $D(U^{\odot}_\sigma)$ defines a
strongly continuous representation of $G$. As to the extent of $D(U^{\odot}_\sigma)$, we now
discuss the situation for $G$ locally compact in the following


REMARK 7.1. Suppose that $G$ is a locally compact group and that $\sigma \to U_\sigma$
is a strongly continuous bounded representation of $G$ on $C_0(X)$. Given $\lambda$
in $C_0(X)^*$ and $h$ in $L(G)$, define

(7.1)    $\lambda^h(f) = \int h(\sigma)\,\lambda(U_{\sigma^{-1}}f)\,d\sigma$,  $f$ in $C_0(X)$,

where the integral is taken relative to left invariant Haar measure on $G$. Then
(1) $\lambda^h$ lies in $D(U^{\odot}_\sigma)$ so that $\sigma \to U^{\odot}_\sigma\lambda^h$ is strongly continuous,
(2) the set of all such $\lambda^h$ ($\lambda$ and $h$ varying) is strongly dense in $D(U^{\odot}_\sigma)$, and
(3) $D(U^{\odot}_\sigma)$, the strong closure of the set in (2), is $W^*$-dense in $C_0(X)^*$.


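Proof of (1):

    $|\lambda^h(U_{\tau^{-1}}f - f)| \le \int |h(\tau^{-1}\sigma) - h(\sigma)| \cdot |\lambda(U_{\sigma^{-1}}f)|\,d\sigma \le K\|f\| \cdot \|h(\tau^{-1}\cdot) - h(\cdot)\|_1$,

for some constant $K$. Since $\tau \to h(\tau^{-1}\cdot)$ is continuous on $G$ to $L_1(G)$, it follows
that $\lambda^h$ lies in $D(U^{\odot}_\sigma)$.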


Proof of (2): Suppose $\lambda \in D(U^{\odot}_\sigma)$. Then given $\epsilon > 0$, there exists a
neighborhood $N$ of the identity $e$ in $G$ such that

    $|\lambda(U_{\sigma^{-1}}f - f)| \le \epsilon\|f\|$,

for all $f \in C_0(X)$, $\sigma \in N$. Choose a non-negative $h$ in $L(G)$ vanishing off $N$
and so that $\int h(\sigma)\,d\sigma = 1$. Then

(7.2)    $|\lambda(f) - \int h(\sigma)\,\lambda(U_{\sigma^{-1}}f)\,d\sigma| \le \int h(\sigma) \cdot |\lambda(f - U_{\sigma^{-1}}f)|\,d\sigma \le \epsilon\|f\|$

and therefore $\|\lambda - \lambda^h\| \le \epsilon$.

Proof of (3): Suppose next that $\lambda$ is an arbitrary element of $C_0(X)^*$. Then
given $\epsilon > 0$ there exists a neighborhood $N$ of $e$ in $G$, depending on $f$, such
that $|\lambda(U_{\sigma^{-1}}f - f)| \le \epsilon$ for all $\sigma \in N$. Choosing $h$ as above, the first inequality
in (7.2) shows that $|\lambda(f) - \lambda^h(f)| \le \epsilon$ and therefore that $[\lambda^h;\ h \in L(G)]$ is
$W^*$-dense in $C_0(X)^*$. Finally we note that $D(U^{\odot}_\sigma)$ is strongly closed since the
$U^{\odot}_\sigma$ are uniformly bounded.

The above argument can readily be extended to the case where $[U_\sigma]$ is
merely strongly continuous, if one makes greater use of the local compactness
of $G$.

Consider now a strongly continuous bounded positive representation
$[U_\sigma = L_{\theta(\cdot,\sigma)}T_\sigma]$ of the topological group $G$ on $C_0(X)$. We recall (Theorem
2.1) that the flow representation $[T_\sigma]$ is also strongly continuous. We wish
to determine conditions under which the adjoint representations $[U^{\odot}_\sigma, D(U^{\odot}_\sigma)]$
and $[T^{\odot}_\sigma, D(T^{\odot}_\sigma)]$ (the "adjoint flow representation") are equivalent. For
this purpose, we shall say that the cocycle $\theta(\cdot, \sigma)$ of $[U_\sigma]$ has a measurable
factorization if

(7.3)    $\theta(\cdot, \sigma) = p(\cdot)/p(\cdot\sigma)$,

for $p$ a positive function on $X$, bounded away from 0 and infinity, and measurable,
in the sense that its contraction to each Borel set is measurable. (Here,
the Borel sets consist of the $\sigma$-ring generated by compact sets.) If $p$ is such a
function and if $\lambda \in C_0(X)^*$ is represented by the signed Borel measure $\lambda(E)$,
then we see that the functional

    $\mu_p(f) = \int p(x)f(x)\,\lambda(dx)$

again lies in $C_0(X)^*$ and

    $\mu_p(E) = \int_E p(x)\,\lambda(dx)$

for all Borel sets $E$. Heuristically one can write $\mu_p(f) = \lambda(pf)$. We can therefore
define a bounded linear operator $W$ on $C_0(X)^*$ by

(7.4)    $(W\lambda)(f) = \mu_p(f)$.

With this definition, we then have

(7.5)    $(W^{-1}T^{\odot}_\sigma W)\lambda = U^{\odot}_\sigma\lambda$,


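for all $\lambda$ in $C_0(X)^*$. In order to prove this we note that

    $(T^{\odot}_\sigma\lambda)(f) = \lambda(T_{\sigma^{-1}}f) = \int f(x\sigma^{-1})\,\lambda(dx) = \int f(x)\,\lambda(dx\sigma)$,

so that $(T^{\odot}_\sigma\lambda)(E) = \lambda(E\sigma)$. Hence, writing $\nu$ for the measure $\nu(E) = \int_{E\sigma} p(y)\,\lambda(dy)$,

    $(W^{-1}T^{\odot}_\sigma W)(\lambda)(f) = (W^{-1}T^{\odot}_\sigma)(\mu_p)(f) = (W^{-1}\nu)(f)$
    $\qquad = \int f(x)\,[p(x)]^{-1}\,\nu(dx)$
    $\qquad = \int f(x)\,[p(x)]^{-1}\,p(x\sigma)\,\lambda(dx\sigma)$
    $\qquad = \int f(x\sigma^{-1})\,\theta(x, \sigma^{-1})\,\lambda(dx) = \lambda(U_{\sigma^{-1}}f) = (U^{\odot}_\sigma\lambda)(f)$,

as asserted.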
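Relation (7.5) can be sanity-checked on a finite model. The sketch below assumes, purely for illustration, that $X = G = \mathbb{Z}_n$ with $x\sigma = x + \sigma \pmod n$, that a signed measure $\lambda$ is a list of point masses, and that $\theta(x, \sigma) = p(x)/p(x\sigma)$. Function names are hypothetical.

```python
from fractions import Fraction

def T_adj(lam, s):
    """(T^adj_s lam)(E) = lam(E s): the mass at y becomes lam[y + s]."""
    n = len(lam)
    return [lam[(y + s) % n] for y in range(n)]

def W(lam, p):
    """(W lam)(E) = sum over E of p(x) lam(x)."""
    return [p[x] * lam[x] for x in range(len(lam))]

def W_inv(lam, p):
    """Inverse of W: divide each point mass by p."""
    return [lam[x] / p[x] for x in range(len(lam))]

def U(f, s, p):
    """(U_s f)(x) = theta(x, s) f(x + s), theta(x, s) = p(x) / p(x + s)."""
    n = len(f)
    return [p[x] / p[(x + s) % n] * f[(x + s) % n] for x in range(n)]

def pair(lam, f):
    """The functional lam applied to f."""
    return sum(l * v for l, v in zip(lam, f))

def U_adj_pairing(lam, f, s, p):
    """(U^adj_s lam)(f) = lam(U_{-s} f)."""
    return pair(lam, U(f, -s, p))
```

For any rational `lam`, `f`, positive `p` and shift `s`, the value `pair(W_inv(T_adj(W(lam, p), s), p), f)` coincides exactly with `U_adj_pairing(lam, f, s, p)`.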

To establish the equivalence of $[U^{\odot}_\sigma, D(U^{\odot}_\sigma)]$ and $[T^{\odot}_\sigma, D(T^{\odot}_\sigma)]$ under the
assumption (7.3), it remains to show that $D(U^{\odot}_\sigma) = D(T^{\odot}_\sigma)$. This will follow
from


LEMMA 7.1. Assume $\lambda$ lies in $D(T^{\odot}_\sigma)$. Let $p$ be a bounded non-negative
function on $X$, measurable relative to each Borel set. Then the linear functional
$\mu_p(f) = \lambda(pf)$ also lies in $D(T^{\odot}_\sigma)$.


Proof. As is well known, $\lambda$ can be expressed as the difference of two bounded
positive functionals, $\lambda = \lambda_1 - \lambda_2$. The bounded positive functional $(\lambda_1 + \lambda_2)$
induces a regular Borel measure $m$ on $X$ with

    $m(X) = \mathrm{LUB}[m(C);\ C \text{ compact}] < \infty$.

Take any $\delta > 0$. Choose a compact set $K_1$ so that $m(X) - m(K_1) < \delta$.
By Lusin's theorem, we can find a compact $K \subset K_1$ so that $m(K_1) - m(K) < \delta$
and the restriction $p|K$ of $p$ to $K$ is continuous. Next, we can extend $p|K$
to a non-negative element $p_1$ of $L(X)$ with preservation of the bound $M$ of $p$.
This gives

    $|\mu_p(f) - \mu_{p_1}(f)| \le (\lambda_1 + \lambda_2)(|f(p - p_1)|) \le 2\delta M\|f\|$.

It follows therefore that $\mu_p$ is a uniform limit of functionals $\mu_p$, $p$ in $L(X)$.
Because $D(T^{\odot}_\sigma)$ is strongly closed, it will therefore suffice to prove the lemma
under the initial assumption that $p \in L(X)$. In this case $pf \in L(X)$ and
$\mu_p(f) = \lambda(pf)$ is strictly correct. For any $\tau$ in $G$ and $f$ in $C_0(X)$,

(7.6)    $|\mu_p(T_{\tau^{-1}}f - f)| \le |\lambda[T_{\tau^{-1}}(fp) - fp]| + (\lambda_1 + \lambda_2)[|(T_{\tau^{-1}}f)p - T_{\tau^{-1}}(fp)|]$.

Since $\lambda \in D(T^{\odot}_\sigma)$, there exists a neighborhood $N_1$ of $e$ in $G$ so that the first
term is $< \tfrac12\epsilon$, for all $f$ of norm $\le 1$. Choose a symmetric neighborhood $N$ of $e$,
$N \subset N_1$, so that $\|p(\cdot\tau) - p(\cdot)\| < \delta$, $\tau \in N$. The second term in (7.6)
becomes

    $\int |f(x\tau^{-1})(p(x) - p(x\tau^{-1}))|\,m(dx) \le \delta\int |f(x\tau^{-1})|\,m(dx) \le \delta\|f\|\,(\|\lambda_1\| + \|\lambda_2\|)$,

for all $\tau$ in $N$. We can assume $\delta$ chosen to make this bound $< \tfrac12\epsilon$, again for all
$f$ of norm $\le 1$. Therefore, for $\tau$ in $N$ and $\|f\| \le 1$, (7.6) has the bound $\epsilon$, and
the Lemma is proved.


It follows from this that the operator $W$ of (7.4) will carry $D(T^{\odot}_\sigma)$ into
itself. Since the same must be true of $W^{-1}$, we have

(7.7)    $W[D(T^{\odot}_\sigma)] = D(T^{\odot}_\sigma)$.

According to the relation (7.5), $U^{\odot}_\sigma\lambda$ and $T^{\odot}_\sigma W\lambda$ will be strongly continuous
together so that $D(T^{\odot}_\sigma) = W[D(U^{\odot}_\sigma)]$. Consequently $D(U^{\odot}_\sigma) = D(T^{\odot}_\sigma)$,
and we have established


THEOREM 7.1. Let $\sigma \to U_\sigma = L_{\theta(\cdot,\sigma)}T_\sigma$ be a strongly continuous bounded
positive representation of the topological group $G$ on $C_0(X)$, $[T_\sigma]$ being the
associated flow representation. If the cocycle $\theta(\cdot, \sigma)$ has a measurable factorization,
then the corresponding adjoint representations $[U^{\odot}_\sigma, D(U^{\odot}_\sigma)]$ and $[T^{\odot}_\sigma, D(T^{\odot}_\sigma)]$
are equivalent.


As an indication of the existence of measurable factorizations, we prove the 
following two lemmas. Here, as elsewhere, a real-valued function on the 
locally compact Hausdorff space X is called measurable if its contraction to 
each Borel set is measurable in the customary sense. 


LEMMA 7.2. If $G$ is a separable topological group acting continuously on $X$,
then each bounded cocycle in $Z^1(G, P)$ has a measurable factorization.


Proof. Let $\{\sigma_n\}$ be a countable dense subset of $G$ and set $h(x) = \mathrm{GLB}_n\,\theta(x, \sigma_n)$
(pointwise). Denoting by $M$ the class of all measurable functions on
$X$ which are bounded away from 0 and infinity, we see that the function $h$
lies in $M$. On the other hand, if for fixed $x$ we apply Theorem 5.1 to the single
orbit $\Pi = xG$, we perceive that $\theta(x, \sigma)$ is continuous in $\sigma$ and hence $h(x) = \mathrm{GLB}_\sigma\,\theta(x, \sigma)$.
Employing Lemma 5.1, with $M$ defined as above, we obtain
$\theta(x, \sigma) = h(x)/h(x\sigma)$.
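The factorization by greatest lower bounds can be illustrated on a toy case. The sketch below assumes a finite cyclic group $\mathbb{Z}_n$ acting on itself by translation and a cocycle built from some positive `p` (so every infimum is a minimum); it is an illustration with hypothetical names, not the general argument. It recovers a factor `h` from the cocycle alone.

```python
from fractions import Fraction

def make_cocycle(p):
    """theta(x, s) = p(x) / p(x + s) on Z_n, a sample bounded cocycle."""
    n = len(p)
    def theta(x, s):
        return p[x % n] / p[(x + s) % n]
    return theta

def glb_factor(theta, n):
    """h(x) = GLB over s of theta(x, s); a minimum, since the group is finite."""
    return [min(theta(x, s) for s in range(n)) for x in range(n)]

def factorizes(theta, h):
    """Check theta(x, s) = h(x) / h(x + s) for all x and s."""
    n = len(h)
    return all(theta(x, s) == h[x] / h[(x + s) % n]
               for x in range(n) for s in range(n))
```

Here the identity holds because $h(x\sigma) = \mathrm{GLB}_\tau\,\theta(x\sigma, \tau) = \mathrm{GLB}_\tau\,\theta(x, \sigma\tau)/\theta(x, \sigma) = h(x)/\theta(x, \sigma)$; in the toy case `h` is simply `p` divided by its maximum.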


LEMMA 7.3. If $G$ is a $\sigma$-compact locally compact topological group acting
continuously on $X$, then each continuous cocycle in $Z^1(G, P)$ has a measurable
factorization.


This result is an immediate consequence of Lemma 5.1 (with $M$ defined
as in the proof of Lemma 7.2) and the following


LEMMA 7.4. Suppose that $G$ and $X$ are topological spaces, $G$ $\sigma$-compact
locally compact and $X$ merely locally compact. Let $f(x, \sigma)$ be any real-valued
continuous function on $X \times G$ with $f(x, \sigma) > 0$. Then the (pointwise) $\mathrm{GLB}_\sigma\,f(x, \sigma)$
is a measurable function on $X$.


Proof. Fix a compact subset $C$ in $G$. We shall prove that $b(x) = \mathrm{GLB}_{\sigma \in C}\,f(x, \sigma)$
is measurable. This will, in effect, establish the Lemma; for $G$ is a
union of an increasing sequence $\{C_n\}$ of compact sets, and if $b_n$ denotes the $b$
corresponding to $C_n$, then $\mathrm{GLB}_n\,b_n(x)$ is measurable and equal to $\mathrm{GLB}_\sigma\,f(x, \sigma)$.

To prove that $b$ is measurable, let $F$ be any compact subset in $X$. For
each $x$ in $F$, choose a neighborhood $N(x)$ of $x$ so that $|f(x, \sigma) - f(y, \sigma)| < 1/2n$
for all $y$ in $N(x)$ and $\sigma$ in $C$. Further, given $x$, choose $\sigma_x$ in $C$ so that
$f(x, \sigma_x) \le f(x, \sigma)$ for all $\sigma$ in $C$. Now a finite number $N(x_1), \ldots, N(x_r)$ of
the $N(x)$'s cover $F$. Finally set

    $h_n(x) = \inf[f(x, \sigma_{x_i});\ i = 1, \ldots, r]$.

For all $x$ in $X$, we clearly have $b(x) \le h_n(x)$. Take a pair $(x, \sigma)$ in $F \times C$,
say $x$ lies in $N(x_i)$. Then

    $h_n(x) \le f(x, \sigma_{x_i}) < f(x_i, \sigma_{x_i}) + \dfrac{1}{2n} \le f(x_i, \sigma) + \dfrac{1}{2n} < f(x, \sigma) + \dfrac{1}{n}$.

Thus $x$ in $F$ entails $h_n(x) \le b(x) + 1/n$. We now define $h_F(x) = \mathrm{GLB}_n\,h_n(x)$.
This function $h_F$ is clearly measurable, satisfies $b(x) \le h_F(x)$ for all $x$, and
$b(x) = h_F(x)$ for all $x$ in $F$. Finally we note that any Borel set can be covered
by the union of an increasing sequence of compact sets, say $\{F_n\}$. But
$\mathrm{GLB}_n\,h_{F_n}(x)$ is measurable and equal to $b(x)$ on this union, that is, on the given
Borel set. It follows that $b$ is measurable in the generalized sense. This
concludes the proof.


We see from the foregoing material that the equivalence of the adjoints of
two strongly continuous positive representations is easier to establish than
the equivalence of the original representations. On the other hand if the
adjoints of two linear bounded operators, say $U$ and $V$, are equivalent (in
the sense that there exists a linear bounded regular operator $W$ on $C_0(X)^*$
such that $V^* = WU^*W^{-1}$), then the spectra of $U$ and $V$ coincide (see, for
instance, (7, Theorem 1.5)). In particular, if $[U_\sigma]$ is a strongly continuous
bounded positive representation of a separable $G$ or of a $\sigma$-compact locally
compact $G$, then it follows from this fact together with Theorem 7.1 and
Lemmas 7.2 and 7.3 that the spectrum of $U_\sigma$ coincides with that of the
associated flow operator $T_\sigma$ for each $\sigma \in G$.

Actually, spectral problems are best dealt with in the setting of a complex
linear space rather than a real linear space. For a complex linear $C_0(X)$, the
notion of positivity remains the same as before and, in fact, everything we
have established applies with obvious modifications.


8. Appendix. We close this paper with an application which is of interest
in the theory of semi-groups of operators. We shall exhibit two one-parameter
strongly continuous groups of operators on the complex linear space $C_0(X)$
having infinitesimal operators $A_1$ and $A_2$, respectively, with $D = D(A_1) \cap D(A_2)$
dense in $C_0(X)$, such that no extension of $A_1 + A_2$ (defined on $D$)
generates a strongly continuous semi-group of operators.

Set $X = (-\infty, \infty)$, let $G$ be the additive group of real numbers with the
usual topology, and define $xt = x + t$ and

(8.1)    $\theta(x, t) = \exp\left[\int_x^{x+t} \beta(\tau)\,d\tau\right]$.

If $\beta(x)$ is continuous in $x$ and if

    $\sup\left[\,\left|\int_x^{x+t} \beta(\tau)\,d\tau\right|;\ -\infty < x < \infty\right] < \infty$

for each $t$, then it is easy to see that $\theta(x, t)$ is a continuous cocycle in $Z^1(G, P)$.
Moreover such a $\theta(x, t)$ will cobound if and only if it is bounded, that is, if
and only if

(8.2)    $\sup_x \left|\int_0^x \beta(\tau)\,d\tau\right| < \infty$;

a suitable $P$-factor being

(8.3)    $p(x) = \exp\left[-\int_0^x \beta(\tau)\,d\tau\right]$.

We note that $p(x)$ is continuously differentiable and bounded away from 0
and infinity.

Let $[T_t]$ denote the flow representation: $T_tf(x) = f(x + t)$. A straightforward
computation shows that the infinitesimal operator of $[T_t]$ is given by

(8.4)    $A_0f(x) = f'(x)$

with

(8.5)    $D(A_0) = [f;\ f(x) \text{ continuously differentiable},\ f \text{ and } f' \in C_0(X)]$.
Suppose next that $\beta(\cdot)$ satisfies the condition (8.2). Then the corresponding
representation: $U_tf(x) = \theta(x, t)f(x + t)$, is equivalent with the flow
$[T_t]$; in fact,

(8.6)    $U_t = L_pT_tL_p^{-1}$,  $t \in G$,

where $L_pf(x) = p(x)f(x)$. It follows from this that the infinitesimal operator
of $[U_t]$ is given by

(8.7)    $A_1f(x) = [L_pA_0L_p^{-1}f](x) = f'(x) + \beta(x)f(x)$

and

(8.8)    $D(A_1) = L_p[D(A_0)]$.
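Formula (8.7) follows from $p'/p = -\beta$, since $p\,(f/p)' = f' - f\,p'/p$. A small numerical sketch, assuming the sample coefficient $\beta(x) = 1/(1 + x^2)$ and the test function $f(x) = e^{-x^2}$ (both chosen here for illustration only and not taken from the paper), compares a finite-difference evaluation of $L_pA_0L_p^{-1}f$ with $f' + \beta f$.

```python
import math

def beta(x):
    """Sample coefficient with bounded integral (an assumption for illustration)."""
    return 1.0 / (1.0 + x * x)

def p(x):
    """P-factor p(x) = exp(-int_0^x beta) = exp(-atan x) for this beta."""
    return math.exp(-math.atan(x))

def theta(x, t):
    """Cocycle theta(x, t) = exp(int_x^{x+t} beta)."""
    return math.exp(math.atan(x + t) - math.atan(x))

def f(x):
    return math.exp(-x * x)

def f_prime(x):
    return -2.0 * x * math.exp(-x * x)

def conjugated_generator(x, h=1e-5):
    """[L_p A_0 L_p^{-1} f](x) = p(x) (f/p)'(x), by a central difference."""
    def quotient(y):
        return f(y) / p(y)
    return p(x) * (quotient(x + h) - quotient(x - h)) / (2.0 * h)

def closed_form(x):
    """Right-hand side of (8.7): f'(x) + beta(x) f(x)."""
    return f_prime(x) + beta(x) * f(x)
```

The same sample also satisfies $\theta(x, t) = p(x)/p(x + t)$, which is relation (8.6) applied to a function.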


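We now choose

$$\beta(x) = \begin{cases} n\,j[n^3(x - n)] & \text{for } n \le x \le n + n^{-3}, \\ 0 & \text{for } x < 2 \text{ and } n + n^{-3} < x < n + 1, \end{cases} \qquad n = 2, 3, \ldots,$$

where $j(x) = \exp\{-[x(1 - x)]^{-1}\}$ for $0 < x < 1$. Then $\beta(x)$ is continuously
differentiable (but not bounded) and

    $0 \le -\log p(x) \le \int_0^\infty \beta(\tau)\,d\tau = \left[\int_0^1 j(\tau)\,d\tau\right] \sum_{n=2}^\infty n^{-2} < \infty$,

so that $p$ lies in $P$ and the above remarks are applicable.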
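The two competing features of this $\beta$ (integrable, yet unbounded) can be seen numerically. The sketch below assumes the scaling $\beta(x) = n\,j[n^3(x - n)]$ on $[n, n + n^{-3}]$; the exponent 3 is our reading of the source, chosen so that the $n$th bump contributes $n^{-2}\int_0^1 j$ to the integral. Helper names are hypothetical.

```python
import math

def j(x):
    """Bump j(x) = exp(-1 / (x (1 - x))) on (0, 1), extended by 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return math.exp(-1.0 / (x * (1.0 - x)))

def beta(x):
    """beta(x) = n j(n^3 (x - n)) on [n, n + n^-3] for n >= 2, else 0 (assumed scaling)."""
    n = math.floor(x)
    if n < 2:
        return 0.0
    return n * j(n ** 3 * (x - n))

def integral_of_j(steps=2000):
    """Midpoint rule for the integral of j over (0, 1)."""
    h = 1.0 / steps
    return h * sum(j((k + 0.5) * h) for k in range(steps))

def integral_of_beta_up_to(N, steps=2000):
    """Integral of beta over [0, N]: the n-th bump contributes (int j) / n^2."""
    c = integral_of_j(steps)
    return sum(c / n ** 2 for n in range(2, N))

def peak_of_beta(n):
    """Value of beta at the centre of the n-th bump, equal to n * exp(-4)."""
    return beta(n + 0.5 / n ** 3)
```

The partial integrals stay below $(\pi^2/6 - 1)\int_0^1 j$, while `peak_of_beta(n)` grows linearly in `n`, so $\sup_x \beta(x) = \infty$.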

















Finally we choose $[U_t^{(2)}]$ to be the backward flow representation, that is
$U_t^{(2)} = T_{-t}$. The infinitesimal operator for $[U_t^{(2)}]$, namely $A_2$, is now given by

(8.9)    $A_2 = -A_0$ and $D(A_2) = D(A_0)$.


It is clear from (8.5) and (8.8) that $D = D(A_1) \cap D(A_2)$ contains the class
$D_0$ of all continuously differentiable functions with compact carrier. Thus $D$
is dense in $C_0(X)$. For $f \in D$ we have

(8.10)    $[(A_1 + A_2)f](x) = \beta(x)f(x)$.


Suppose now that $A_3$ with domain $D(A_3)$ is an extension of $A_1 + A_2$ (with
domain $D$). We wish to show that $A_3$ cannot generate a strongly continuous
semi-group of linear bounded operators possessing even the mildest of
regularity conditions at $t = 0$.¹ If the contrary were true, then there would be a
constant $\omega$ such that the resolvent $R(\lambda; A_3)$ would exist and be bounded in
norm for $\Re(\lambda) > \omega$. In this case the semi-group $[U_t^{(3)};\ t > 0]$ generated
by $A_3$ could be computed from the inversion formula (cf. 5, p. 239)

(8.11)    $U_t^{(3)}f = \lim_{\tau \to \infty} \dfrac{1}{2\pi i}\int_{\gamma - i\tau}^{\gamma + i\tau} e^{\lambda t}R(\lambda; A_3)f\,d\lambda$,  $t > 0$,

for $\gamma > \omega$ and each $f \in D[(A_3)^2]$; the integral can be taken either as an abstract
Cauchy integral or the usual Cauchy integral for each $x$. For $f \in D_0$ we see by
(8.10) that $A_3f = (A_1 + A_2)f \in D_0$ so that such an $f$ lies in $D[(A_3)^2]$. Let
$C(f)$ denote the support for $f \in D_0$ and define $\gamma(f) = \sup[\beta(x);\ x \in C(f)]$.
Then if $f \in D_0$ and $\Re(\lambda) \ge \gamma > \gamma(f)$, it is clear that

    $g(x) = [\lambda - \beta(x)]^{-1}f(x) \in D_0$

and hence that

    $R(\lambda; A_3)f = R(\lambda; A_3)(\lambda I - A_3)g = g$.

Applying (8.11) we obtain

    $U_t^{(3)}f(x) = \lim_{\tau \to \infty} \dfrac{1}{2\pi i}\int_{\gamma - i\tau}^{\gamma + i\tau} e^{\lambda t}[\lambda - \beta(x)]^{-1}f(x)\,d\lambda = e^{\beta(x)t}f(x)$

for all $t > 0$. Finally since $D_0$ is dense in $C_0(X)$ and $U_t^{(3)}$ is assumed to be a
bounded operator, we must have $U_t^{(3)}f(x) = \exp[t\beta(x)]f(x)$ for all $f \in C_0(X)$
and each $t > 0$. However this is impossible since an obvious consequence of
this relation would be $\log \|U_t^{(3)}\| = t\,\sup_x \beta(x) = \infty$ for each $t > 0$.

¹More precisely, we shall prove that $A_3$ does not generate a semi-group of class (A) (8).


REFERENCES 


1. S. Banach, Théorie des opérations linéaires (Warsaw, 1932).
2. S. Eilenberg and S. MacLane, Cohomology theory in abstract groups I, Ann. Math. (2)
48 (1947), 51-78.
3. W. Feller, On positivity preserving semi-groups of transformations on $C[r_1, r_2]$, Ann. Soc.
Polonaise de Math. 25 (1952), 85-94.
4. P. R. Halmos, Measure Theory (New York, 1950).
5. E. Hille, Functional analysis and semi-groups, Amer. Math. Soc. Colloquium Publ.,
XXXI (1948).
6. R. V. Kadison, A generalized Schwarz inequality and algebraic invariants for operator
algebras, Ann. Math. 56 (1952), 494-503.
7. R. S. Phillips, The adjoint semi-group, Pacific J. Math. 5 (1955), 269-283.
8. R. S. Phillips, Semi-groups of operators, Bull. Amer. Math. Soc. 61 (1955), 16-33.
9. I. M. Singer, Automorphisms of finite factors, Amer. J. Math. 77 (1955), 117-133.
10. M. H. Stone, Applications of the theory of Boolean rings to general topology, Trans. Amer.
Math. Soc. 41 (1937), 375-481.
11. M. H. Stone, Boundedness properties in function-lattices, Can. J. Math. 1 (1949), 176-186.


State University of Iowa
University of Southern California


HYPERSURFACES OF A FINSLER SPACE 
HANNO RUND 


Introduction. Certain aspects of the theory of subspaces of a Finsler 
space had been treated by the present author in earlier papers (7). These 
developments were based on an approach essentially different from the 
classical theory of Cartan (2) and subsequent writers, whose use of the element 
of support enables one to introduce the so-called “euclidean connection,” 
which effects the vanishing of the covariant derivative of the metric tensor. A 
comprehensive treatment of the theory of subspaces of a Finsler space based 
on Cartan’s point of view was given by Davies¹ (4). However, the present
writer seeks to dispense with the notion of element of support in the theory of 
Finsler spaces; in fact, a Finsler space is regarded as being locally Minkowskian 
instead of locally euclidean. From this point of view it is no longer possible to 
establish a euclidean connection in the above sense. This leads to a peculiar 
new geometrical picture; for instance, we have to deal with a set of normals 
attached to a point of a hypersurface instead of a single unique normal, nor 
are the covariant derivatives of these vectors tangential to the hypersurface. 
Two distinct differential forms play the rôle of the second fundamental form,
while the number of principal directions at a point cannot be specified in the 
usual manner, due to lack of linearity. 

The purpose of the present paper is to provide an analytical background 
and an extension of the results of (7), which is mainly geometrical in character. 
Some of the theorems of (7) will be derived once more in the course of our 
analysis: this is unavoidable, but it will be found that these results appear in a 
much improved form leading to new and more comprehensive theorems, of 
which the most interesting ones appear to be the distinct forms of the general- 
izations of the equations of Gauss and Codazzi of classical differential geometry. 
For the sake of geometrical clarity we shall deal with hypersurfaces instead 
of subspaces of arbitrary dimensions. Also, we shall briefly define the basic 
concepts concerning the theory of Finsler spaces, so that the present paper 
may be read independently. 

We consider² a space $F_m$ endowed with a local coordinate system
$x^i$ $(i = 1, \ldots, m)$.
The distance between neighbouring points $x^i$ and $x^i + dx^i$ is defined by
$ds = F(x^i, dx^i)$, where we make the following assumptions about the function $F$:

Received August 26, 1955. This work was undertaken during the 1955 session of the 
Summer Research Institute at Queen's University, Kingston. The writer wishes to express his 
sincere gratitude to the Canadian Mathematical Congress. 


¹The reader is referred also to this paper as regards the relevant literature.
²For these and the following definitions see (8, §2).


(a) F is of class C‘ in its 2m arguments; 
(b) $F$ is positive provided not all $dx^i = 0$;
(c) $F$ is positively homogeneous of first degree in the $dx^i$;


(d) the form $g_{ij}(x^k, dx^k)\,\xi^i\xi^j > 0$ for all $\xi^i \ne 0$ with any given argument
$dx^k$, where we have put

(0.1)    $g_{ij}(x, x') = \dfrac12\,\dfrac{\partial^2 F^2(x, x')}{\partial x'^i\,\partial x'^j}$,  $x'^i = \dfrac{dx^i}{ds}$.

The quantities (0.1) are regarded as the components of the metric tensor
of $F_m$; in view of hypothesis (c) the $g_{ij}$ are homogeneous of degree zero in the
$x'^i$. Thus we have the useful identities:

(0.2)    $\dfrac{\partial g_{ij}(x, x')}{\partial x'^k}\,x'^i = \dfrac{\partial g_{ij}(x, x')}{\partial x'^k}\,x'^k = 0$.
Ox Ox 
The covariant differential of a vector-field $X^i(x^k)$ of $F_m$ is defined by
(0.3)    $DX^i = dX^i + P^i_{hk}(x, dx)\,X^h\,dx^k$,
where 
(04) Pha(x, 2’) = {ih — te Ge, 27) Mo TY gs 
; ‘ hk) (2.2") ; Ox’ jk) (2.2") 


On the other hand the covariant derivative of X‘ with respect to x* is given 
by 


1 
(0.5) or = an + Pri ‘gd 
where 
i re] re) rs] 4 
(0.6) Phin = euP%t = fii] — 4( 284 Phe + 2844 Ps, - 284 Pr.) 2 


We note that the P2¢ are symmetrical in h and k, while for the P%, this is not 
true.* Owing to (0.2) the following identities, which we shall have to use fre- 
quently in the sequel, may be shown to hold: 


(0.7) Phils, x’) x” = Pa(x, x’) x"; Pa(x,x’) x" = ‘it i 


/ (z,2") 


*, ; rk 1 
Pu (x, x’) x” x" an 1 \ xe”. 
hk (z.2’) 
We may remark that for covariant differentiation along an arbitrary curve the 
covariant derivatives of the metric tensor do not in general vanish. 


³E. T. Davies pointed out that the $P^{*i}_{hk}$ are, in fact, identical to the $\Gamma^{*i}_{hk}$ of Cartan (2).
However, since we do not use the element of support, our covariant derivative still differs from
that of Cartan. For instance, Ricci's Lemma holds in Cartan's theory, while this is not the
case for the locally Minkowskian theory.













HYPERSURFACES OF A FINSLER SPACE 489 


1. The projection factors. Consider a hypersurface $F_{n-1}$ of $F_n$, defined by the equations

(1.1) $x^i = x^i(u^\alpha),$

(throughout this paper Greek indices run from 1 to $n - 1$; Latin indices from 1 to $n$) such that the matrix $\|X^i_\alpha\|$, with

(1.2) $X^i_\alpha = \dfrac{\partial x^i}{\partial u^\alpha},$

is of rank $n - 1$. In general we have to consider a set of unit vectors⁴ normal to $F_{n-1}$ at a given point $P$ of $F_{n-1}$. These are defined firstly by the solutions $n^i$ of the equations

(1.3) $n_i X^i_\alpha = g_{ij}(x, n)\,n^j X^i_\alpha = 0.$

These solutions are normalized by means of the relation

(1.4) $F(x, n) = 1 \quad\text{or}\quad g_{ij}(x, n)\,n^i n^j = 1.$

The second set is defined by the solutions $n^{*i}$ of the equations

(1.5) $g_{ij}(x, x')\,n^{*i} X^j_\alpha = 0,$

where $x'$ is an arbitrary direction tangential to $F_{n-1}$ at $P$. Clearly the $n^{*i}$ are functions of this direction: $n^{*i} = n^{*i}(x, x')$. To each direction $x'$ tangent to $F_{n-1}$ at $P$ corresponds such a vector $n^{*i}$; the totality of these vectors at $P$ defines a cone of directions, which we call the normal cone. Again we suppose the $n^{*i}$ to be normalized by means of the relations

(1.6) $F(x, n^*(x, x')) = 1 \quad\text{or}\quad g_{ij}(x, n^*(x, x'))\,n^{*i}(x, x')\,n^{*j}(x, x') = 1.$
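The difference between the normal $n^i$ of (1.3) and a generator $n^{*i}$ of the normal cone of (1.5) is easy to see in the simplest possible setting. The sketch below assumes a two-dimensional Minkowskian plane with a Randers-type norm (the drift `B` and the tangent `X` are illustrative values, not taken from the paper), in which the "hypersurface" is a curve with a single tangent vector $X$; the direction $x'$ in (1.5) is taken to be $+X$.

```python
import math

B = (0.3, 0.1)  # hypothetical Randers drift; Euclidean length must be < 1

def F(y):
    return math.hypot(y[0], y[1]) + B[0] * y[0] + B[1] * y[1]

def lower(y):
    """Covector y_i = g_ij(y) y^j = F(y) dF/dy^i for the norm above."""
    r = math.hypot(y[0], y[1])
    f = F(y)
    return (f * (y[0] / r + B[0]), f * (y[1] / r + B[1]))

def unit(y):
    """Rescale y so that F(y) = 1, as in (1.4) and (1.6)."""
    f = F(y)
    return (y[0] / f, y[1] / f)

def pair(cov, v):
    return cov[0] * v[0] + cov[1] * v[1]

def normal_n(X, iterations=80):
    """Solve g_ij(n) n^j X^i = 0 (equation (1.3)) by bisection on the angle of n."""
    t0 = math.atan2(X[1], X[0])
    lo, hi = t0, t0 + math.pi  # the pairing is > 0 at lo and < 0 at hi

    def h(t):
        return pair(lower(unit((math.cos(t), math.sin(t)))), X)

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    return unit((math.cos(t), math.sin(t)))

def normal_n_star(X):
    """Solve g_ij(X) X^j n*^i = 0 (equation (1.5) with x' = X): n* annihilates lower(X)."""
    c = lower(X)
    return unit((-c[1], c[0]))

X = (1.0, 0.0)
n = normal_n(X)
n_star = normal_n_star(X)
cos_n_nstar = pair(lower(n), n_star)  # Minkowskian cosine g_ij(n) n^i n*^j
```

For these illustrative values the two unit normals point in visibly different directions and neither is Euclidean-perpendicular to `X`; `cos_n_nstar` is the quantity written $\cos(n, n^*)$ in the text.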
For the sake of brevity we shall write

(1.7) $n^*_j(x, x') = g_{ij}(x, x')\,n^{*i}(x, x'),$

where it is to be noted that this does not represent the covariant components of $n^{*i}$. We shall also have occasion to use the function defined by

(1.8) $\psi(x, x') = g_{ij}(x, x')\,n^{*i}(x, x')\,n^{*j}(x, x').$

From equations (1.3), (1.5), (1.7) and (1.8) we deduce that

(1.9) $n^*_i(x, x') = \psi(x, x')\,[\cos(n, n^*)]^{-1}\,n_i,$

where the Minkowskian cosine is defined (6, p. 62) on the indicatrix $F(x^k, {x'}^k) = 1$ of the Minkowskian tangent space to $F_n$ at $P$. The metric tensor of $F_{n-1}$ is given as usual by

(1.10) $g_{\alpha\beta}(u, u') = g_{ij}(x, x')\,X^i_\alpha X^j_\beta,$

where the directional argument ${u'}^\alpha$ tangent to $F_{n-1}$ satisfies

(1.11) ${x'}^i = X^i_\alpha\,{u'}^\alpha.$








⁴For details concerning these definitions see (7, Part I, §4).













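Similarly we may also define a tensor independent of direction by putting

(1.12) $\gamma_{\alpha\beta}(u) = g_{ij}(x, n)\,X^i_\alpha X^j_\beta.$

Corresponding to (1.10) and (1.12) we have to define two sets of inverse projection parameters, respectively dependent and independent of direction:

(1.13) $X^\alpha_i(x, x') = g_{ij}(x, x')\,g^{\alpha\beta}(u, u')\,X^j_\beta;$
(1.14) $Y^\alpha_i(x) = g_{ij}(x, n)\,\gamma^{\alpha\beta}(u)\,X^j_\beta.$

It follows that

(1.15) $n^{*i}(x, x')\,X^\alpha_i(x, x') = 0; \qquad n^i\,Y^\alpha_i = 0;$

while

(1.16) $X^\alpha_i(x, x')\,X^i_\beta = \delta^\alpha_\beta; \qquad Y^\alpha_i\,X^i_\beta = \delta^\alpha_\beta.$

For an arbitrary direction $x'$ tangent to $F_{n-1}$ at $P$ we may decompose the metric tensor as follows:

$g_{ij}(x, x') = g_{\alpha\beta}(u, u')\,X^\alpha_i X^\beta_j + m_\alpha(u, u')\,X^\alpha_i n^*_j + m_\alpha(u, u')\,X^\alpha_j n^*_i + \chi(u, u')\,n^*_i n^*_j.$

On multiplying this equation successively by $n^{*i}$, $X^i_\gamma$, it follows from the preceding identities that

(1.17) $g_{ij}(x, x') = g_{\alpha\beta}(u, u')\,X^\alpha_i(x, x')\,X^\beta_j(x, x') + \dfrac{1}{\psi}\,n^*_i(x, x')\,n^*_j(x, x').$

Similarly

(1.18) $g_{ij}(x, n) = \gamma_{\alpha\beta}(u)\,Y^\alpha_i Y^\beta_j + n_i n_j.$

On multiplying (1.17) and (1.18) by $g^{jk}(x, x')$ and $g^{jk}(x, n)$ respectively, we find that

(1.19) $X^i_\beta(x)\,X^\beta_j(x, x') = \delta^i_j - \dfrac{1}{\psi}\,n^{*i}(x, x')\,n^*_j(x, x'),$

and

(1.20) $X^i_\beta(x)\,Y^\beta_j(x) = \delta^i_j - n^i n_j.$

From these two equations together with (1.13) and (1.14) it follows that

(1.21) $g^{\alpha\beta}(u, u')\,X^i_\alpha(x)\,X^j_\beta(x) = g^{ij}(x, x') - \dfrac{1}{\psi}\,n^{*i}(x, x')\,n^{*j}(x, x'),$

(1.22) $\gamma^{\alpha\beta}(u)\,X^i_\alpha X^j_\beta = g^{ij}(x, n) - n^i n^j.$

Let

(1.23) $X^i(x^k) = X^i_\alpha\,U^\alpha(u^\beta)$

be a continuous and continuously differentiable vector field tangent to $F_{n-1}$. The induced covariant derivative

(1.24) $U^\alpha_{;\beta} = \dfrac{\partial U^\alpha}{\partial u^\beta} + P^{*\alpha}_{\gamma\beta}(u, u')\,U^\gamma$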


of $U^\alpha$ with respect to $F_{n-1}$ is defined by the projection onto $F_{n-1}$ of the covariant derivative $X^i_{;k}$ of $X^i$ with respect to $F_n$:

(1.25) $g_{ij}(x, x')\,X^j_\alpha X^k_\beta X^i_{;k} = g_{\alpha\gamma}(u, u')\,U^\gamma_{;\beta}.$

On substituting for the covariant derivative from (0.5) and differentiating (1.23) with respect to $u^\beta$, a simple calculation yields

(1.26) $P^*_{\alpha\beta\gamma}(u, u') = g_{\alpha\delta}(u, u')\,P^{*\delta}_{\beta\gamma}(u, u') = g_{ij}(x, x')\,X^i_\alpha\left(\dfrac{\partial^2 x^j}{\partial u^\beta\,\partial u^\gamma} + P^{*j}_{hk}(x, x')\,X^h_\beta X^k_\gamma\right).$

It is easily verified by means of (1.26) that under a transformation of the coordinates $u^\alpha$ of $F_{n-1}$, the quantities (1.24) form the components of a tensor in the sense indicated by their indices. Also, the $P^{*\alpha}_{\beta\gamma}$ are symmetric in their lower indices. We remark that the induced connection coefficients need not necessarily be identical with the intrinsic coefficients of $F_{n-1}$, i.e. the connection coefficients which are derived from the $g_{\alpha\beta}$ and their derivatives in a manner analogous to that in which the $P^{*i}_{hk}$ are derived from the $g_{ij}$ and their derivatives.⁵ However, if equation (1.10) is differentiated with respect to $u^\gamma$, one may obtain the transformation laws for the intrinsic Christoffel symbols $[\alpha\beta, \gamma]$ of the first kind (7, p. 369 (3.5)). On multiplying this equation with ${u'}^\alpha {u'}^\beta$, one finds in view of (0.2):

$[\alpha\beta, \gamma]\,{u'}^\alpha {u'}^\beta = g_{ij}(x, x')\,X^i_\gamma\left(\dfrac{\partial^2 x^j}{\partial u^\alpha\,\partial u^\beta} + \left\{{j \atop h\,k}\right\} X^h_\alpha X^k_\beta\right){u'}^\alpha {u'}^\beta.$

Thus from (0.7), (1.11) and (1.26) it follows that

(1.27) $[\alpha\beta, \gamma]\,{u'}^\alpha {u'}^\beta = P^*_{\gamma\alpha\beta}(u, u')\,{u'}^\alpha {u'}^\beta$

analogously to (0.7).


2. Normal curvature of the hypersurface. Let $C$: $x^i = x^i(s)$ be an arbitrary and continuously differentiable curve of $F_{n-1}$ passing through a given point $P(x^i)$ of $F_{n-1}$. The parameter $s$ is the arc-length. The unit tangent vector $dx^i/ds$ to $C$ at $P$ is denoted by ${x'}^i$, and throughout this section we shall suppose, unless otherwise stated, that the directional arguments of all subsequent functions are ${x'}^i$. At $P$ we have

(2.1) $n_i\,{x'}^i = 0.$

In (7) we defined the normal curvature $R^{-1}(x, x')$ of $F_{n-1}$ for the direction ${x'}^i$ by putting

(2.2) $R^{-1}(x, x') = n_i\,\dfrac{D{x'}^i}{Ds} = -{x'}^i\,\dfrac{Dn_i}{Ds},$
having arrived at this definition by considering variations of the unit normal 
in the neighbourhood of P. In the present section we shall derive a new 


⁵This contradicts to some extent a statement made by the writer on p. 364 of (7). However,
in view of (2.7) the results of (7) continue to hold. Compare also E. T. Davies (4).










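expression for (2.2), using a process entirely different from that of (7), our purpose being to find a more useful expression for the second fundamental form. Let us consider for the moment the special case for which the vector field $U^\alpha$ of equation (1.23) coincides with the tangent vectors ${u'}^\alpha$ of $C$. Using (1.23) and (1.27) we find

(2.3) $\dfrac{\delta {u'}^\alpha}{\delta s} = \dfrac{d{u'}^\alpha}{ds} + P^{*\alpha}_{\beta\gamma}\,{u'}^\beta {u'}^\gamma = \dfrac{d{u'}^\alpha}{ds} + \left\{{\alpha \atop \beta\,\gamma}\right\}{u'}^\beta {u'}^\gamma,$

where $\delta$ denotes covariant differentiation in $F_{n-1}$. But on differentiating (1.2) along $C$ we have

(2.4) $\dfrac{d{x'}^i}{ds} = \dfrac{\partial^2 x^i}{\partial u^\alpha\,\partial u^\beta}\,{u'}^\alpha {u'}^\beta + X^i_\alpha\,\dfrac{d{u'}^\alpha}{ds}.$

On substituting (2.3) and (2.4) in the expression for $D{x'}^i/Ds$ according to (0.3) we thus obtain (taking into account (0.7))

$\dfrac{D{x'}^i}{Ds} = \dfrac{\partial^2 x^i}{\partial u^\alpha\,\partial u^\beta}\,{u'}^\alpha {u'}^\beta + X^i_\alpha\,\dfrac{\delta {u'}^\alpha}{\delta s} - X^i_\alpha\,P^{*\alpha}_{\beta\gamma}\,{u'}^\beta {u'}^\gamma + P^{*i}_{hk}\,{x'}^h {x'}^k,$

or, using (1.11),

(2.5) $\dfrac{D{x'}^i}{Ds} = X^i_{\alpha\beta}\,{u'}^\alpha {u'}^\beta + X^i_\alpha\,\dfrac{\delta {u'}^\alpha}{\delta s},$

where we have put

(2.6) $X^i_{\alpha\beta} = \dfrac{\partial^2 x^i}{\partial u^\alpha\,\partial u^\beta} - X^i_\gamma\,P^{*\gamma}_{\alpha\beta} + P^{*i}_{hk}\,X^h_\alpha X^k_\beta.$

The expression (2.6) suggests that the $X^i_{\alpha\beta}$ may be regarded as generalised covariant derivatives of the $X^i_\alpha$ with respect to $u^\beta$ (as defined in (9), p. 124 for the case of a Riemannian space). This is indeed the case. Using the transformation properties of the connection coefficients, it can be shown by direct transformation that the $X^i_{\alpha\beta}$ have in fact the tensor properties as indicated by the position of their indices. We shall, however, omit this somewhat tedious calculation. Also, we note that they are symmetrical in their lower indices. On multiplying (2.5) by $n_i$ and taking into account (1.3), we find that the normal curvature (2.2) may be expressed in the form

(2.7) $R^{-1}(x, x') = n_i\,X^i_{\alpha\beta}\,{u'}^\alpha {u'}^\beta.$

On multiplying (1.25) by ${u'}^\beta$ it follows that the tangent vector ${u'}^\alpha$ to $C$ satisfies the relation

(2.8) $g_{ij}(x, x')\,X^j_\alpha\,\dfrac{D{x'}^i}{Ds} = g_{\alpha\gamma}(u, u')\,\dfrac{\delta {u'}^\gamma}{\delta s}.$

Now let us consider the geodesic $\Gamma$ of $F_{n-1}$ tangent to $C$ at $P$. A simple calculation shows that the Euler-Lagrange equations reduce to:

$\left(\dfrac{d{u'}^\alpha}{ds}\right)_\Gamma + \left\{{\alpha \atop \beta\,\gamma}\right\}{u'}^\beta {u'}^\gamma = 0.$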







Hence it follows from (2.3) that $(\delta {u'}^\alpha/\delta s)_\Gamma = 0$, i.e. the geodesics are the autoparallel curves. Thus in view of (2.8) the principal normal of $\Gamma$ (regarded as a curve of $F_n$) satisfies

$g_{ij}(x, x')\,X^j_\alpha\left(\dfrac{D{x'}^i}{Ds}\right)_\Gamma = 0.$

Comparison with (1.5) shows that therefore

(2.9) $\left(\dfrac{D{x'}^i}{Ds}\right)_\Gamma = \rho_\Gamma^{-1}\,n^{*i},$

where $\rho_\Gamma^{-1}$ is the curvature of $\Gamma$ (with respect to $F_n$) since $n^{*i}$ is a unit vector by (1.6). In contrast to the properties of hypersurfaces of locally euclidean spaces, $\rho_\Gamma^{-1}$ does not coincide with the normal curvature as defined by (2.2); hence it is called the "secondary" normal curvature, denoted by $(R^*(x, x'))^{-1}$.

If we apply equation (2.5) (which holds for all curves of $F_{n-1}$) to the geodesic $\Gamma$, we have, in view of the remarks made above and (2.9):

(2.10) $X^i_{\alpha\beta}\,{u'}^\alpha {u'}^\beta = n^{*i}/R^*(x, x'),$

and since this equation does not involve second derivatives it holds for all curves of $F_{n-1}$ tangent to $C$ at $P$. If we multiply (2.10) by $n_i$, we have

$R^{-1}(x, x') = n_i\,n^{*i}/R^*(x, x'),$

and hence

(2.11) $R^*(x, x') = \cos(n, n^*)\,R(x, x'),$

in agreement with (7, p. 200) where this relation had been derived by a generalisation of Meusnier's theorem.

Furthermore, equation (2.10) suggests that the $X^i_{\alpha\beta}$ are normal to $F_{n-1}$. This is easily proved as follows. Using (1.11) and (1.10) we may write equation (1.25) in the form

$g_{ij}\,X^j_\gamma X^i_\delta\,P^{*\delta}_{\alpha\beta} = g_{ij}\,X^j_\gamma\left(\dfrac{\partial^2 x^i}{\partial u^\alpha\,\partial u^\beta} + P^{*i}_{hk}\,X^h_\alpha X^k_\beta\right).$

In view of (2.6) this becomes

(2.12) $g_{ij}(x, x')\,X^i_{\alpha\beta}\,X^j_\gamma = 0.$

Comparing this with (1.5) we see that we may write

(2.13) $X^i_{\alpha\beta}(u, u') = \Omega^*_{\alpha\beta}(u, u')\,n^{*i}(x, x').$

The $\Omega^*_{\alpha\beta}$ will be called the coefficients of the secondary second fundamental form, as distinct from an alternative, equally useful definition which we shall introduce presently. This nomenclature is justified by the fact that equations (2.10) and (2.13) together yield

(2.14) $(R^*(x, x'))^{-1} = \Omega^*_{\alpha\beta}(u, u')\,{u'}^\alpha {u'}^\beta,$

so that this fundamental form describes the secondary normal curvature, the ${u'}^\alpha$ being components of a unit vector since $s$ is the arc-length of $C$.













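If we multiply (2.13) by $n_i$ we have

(2.15) $n_i\,X^i_{\alpha\beta}(u, u') = \Omega_{\alpha\beta}(u, u'),$

where we have put

(2.16) $\Omega_{\alpha\beta}(u, u') = \Omega^*_{\alpha\beta}(u, u')\,\cos(n, n^*).$

We shall regard the $\Omega_{\alpha\beta}$ as the coefficients of the alternative second fundamental form. This equation is in agreement with the corresponding relation given in (7); we have to show, however, that the $\Omega_{\alpha\beta}$ as defined by (2.15) are identical to the $\Omega_{\alpha\beta}$ defined in (7) according to the relations

(2.17) $\Omega_{\alpha\beta} = -\dfrac{1}{2}\,(n_{i;k} + n_{k;i})\,X^i_\alpha X^k_\beta.$

Since the $P^{*h}_{ik}$ are symmetric in $i$ and $k$, this definition reads:

$\Omega_{\alpha\beta} = -\dfrac{1}{2}\left(\dfrac{\partial n_i}{\partial u^\beta}\,X^i_\alpha + \dfrac{\partial n_i}{\partial u^\alpha}\,X^i_\beta\right) + n_h\,P^{*h}_{ik}\,X^i_\alpha X^k_\beta.$

But on differentiating (1.3) we have

$\dfrac{\partial n_i}{\partial u^\beta}\,X^i_\alpha = -n_i\,\dfrac{\partial^2 x^i}{\partial u^\alpha\,\partial u^\beta},$

so that the above expression becomes

(2.18) $\Omega_{\alpha\beta} = n_i\left(\dfrac{\partial^2 x^i}{\partial u^\alpha\,\partial u^\beta} + P^{*i}_{hk}\,X^h_\alpha X^k_\beta\right).$

In view of (1.3) we may insert the additional term

$-X^i_\gamma\,P^{*\gamma}_{\alpha\beta}$

into the bracket without changing the value of the right-hand side; thus the right-hand side of (2.17) becomes

$n_i\,X^i_{\alpha\beta}$

as a result of (2.6). Thus the definitions (2.15) and (2.17) are equivalent.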


3. Principal directions. From (2.17) and (2.15) we deduce that the normal curvature of $F_{n-1}$ in the direction $du^\alpha$ at $P$ is given by

(3.1) $(R(x, x'))^{-1} = \dfrac{\Omega_{\alpha\beta}(u, u')\,du^\alpha\,du^\beta}{g_{\alpha\beta}(u, u')\,du^\alpha\,du^\beta}.$

The $(n - 2)$-dimensional locus $\Omega_{\alpha\beta}(u, u')\,{u'}^\alpha {u'}^\beta = 1$ in the hyperplane spanned by the $X^i_\alpha$ in the Minkowskian tangent space to $F_n$ at $P$ represents a generalization of the Dupin indicatrix. Principal directions are defined to be directions which are determined by those points on the Dupin indicatrix whose (Minkowskian) distance from the centre of the indicatrix assumes an extreme value relative to neighbouring points. In other words, principal directions are given by extreme values of $g_{\alpha\beta}(u, u')\,{u'}^\alpha {u'}^\beta$ subject to the condition $\Omega_{\alpha\beta}(u, u')\,{u'}^\alpha {u'}^\beta = 1$, where $u^\alpha$ is being kept fixed. As a result of (3.1) principal directions are




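therefore directions for which the normal curvature assumes extreme values. According to the multiplier rule we therefore have to seek solutions of the equations⁶

$\dfrac{\partial}{\partial {u'}^\gamma}\left\{g_{\alpha\beta}(u, u')\,{u'}^\alpha {u'}^\beta + \lambda\,(\Omega_{\alpha\beta}(u, u')\,{u'}^\alpha {u'}^\beta - 1)\right\} = 0,$

or, in view of (0.2),

(3.2) $2g_{\alpha\gamma}(u, u')\,{u'}^\alpha + 2\lambda\,\Omega_{\alpha\gamma}(u, u')\,{u'}^\alpha + \lambda\,\dfrac{\partial \Omega_{\alpha\beta}}{\partial {u'}^\gamma}\,{u'}^\alpha {u'}^\beta = 0.$

This equation may be simplified considerably. We note first that

$\dfrac{\partial}{\partial {x'}^l}\left(\left\{{i \atop h\,k}\right\}{x'}^h {x'}^k\right) = \dfrac{\partial g^{im}}{\partial {x'}^l}\,[hk, m]\,{x'}^h {x'}^k + 2g^{im}\,[lh, m]\,{x'}^h$

as a result of (0.2). Since $g^{im} g_{mj} = \delta^i_j$, this reduces to

(3.3) $\dfrac{\partial}{\partial {x'}^l}\left(\left\{{i \atop h\,k}\right\}{x'}^h {x'}^k\right) = 2\left\{{i \atop l\,k}\right\}{x'}^k - g^{im}\,\dfrac{\partial g_{lm}}{\partial {x'}^p}\left\{{p \atop h\,k}\right\}{x'}^h {x'}^k = 2P^i_{lk}\,{x'}^k$

in virtue of (0.4), since $\partial g_{ij}/\partial {x'}^k$ is symmetric in all its indices (equation (0.1)). Now we differentiate (0.7) with respect to ${x'}^l$; using (3.3) we obtain

$\dfrac{\partial P^{*i}_{hk}}{\partial {x'}^l}\,{x'}^h {x'}^k + 2P^{*i}_{lk}\,{x'}^k = 2P^i_{lk}\,{x'}^k.$

On observing (0.7) once more we deduce immediately:

(3.4) $\dfrac{\partial P^{*i}_{hk}}{\partial {x'}^l}\,{x'}^h {x'}^k = 0.$

But if we differentiate equation (2.18) with respect to ${u'}^\gamma$, we have, since $n_i$ is independent of direction,

$\dfrac{\partial \Omega_{\alpha\beta}}{\partial {u'}^\gamma} = n_i\,\dfrac{\partial P^{*i}_{hk}}{\partial {x'}^l}\,X^h_\alpha X^k_\beta X^l_\gamma;$

and on multiplying this result by ${u'}^\alpha {u'}^\beta$ we may deduce that

(3.5) $\dfrac{\partial \Omega_{\alpha\beta}}{\partial {u'}^\gamma}\,{u'}^\alpha {u'}^\beta = 0,$

having taken into account (1.11) and (3.4). Thus equation (3.2) reduces to

$g_{\alpha\gamma}(u, u')\,{u'}^\alpha = -\lambda\,\Omega_{\alpha\gamma}(u, u')\,{u'}^\alpha.$

Multiplying this result by ${u'}^\gamma$ it follows from (3.1) that $\lambda = -R$, so that the equation for principal directions finally reads:

(3.6) $g_{\alpha\gamma}(u, u')\,{u'}^\alpha = R(u, u')\,\Omega_{\alpha\gamma}(u, u')\,{u'}^\alpha,$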

⁶In (7) principal directions were defined similarly, but for a second fundamental form whose
coefficients are independent of direction. It is shown here that the method applies also to the
general case.













where $(R(u, u'))^{-1}$ is the normal curvature corresponding to a solution of (3.6).

Since this is not a linear eigenvalue problem, nothing can be said about the number of possible independent solutions. However, let us assume for the moment that at least two independent solutions ${u'}^\alpha_{(1)}$, ${u'}^\alpha_{(2)}$ corresponding to two distinct normal curvatures $R_{(1)}^{-1}$ and $R_{(2)}^{-1}$ exist, this assumption being geometrically feasible. Writing down the two equations (3.6) corresponding to each of these solutions and multiplying them by ${u'}^\gamma_{(2)}$ and ${u'}^\gamma_{(1)}$ respectively, we have

(3.7a) $g_{\alpha\gamma}(u, u'_{(1)})\,{u'}^\alpha_{(1)}\,{u'}^\gamma_{(2)} = R_{(1)}\,\Omega_{\alpha\gamma}(u, u'_{(1)})\,{u'}^\alpha_{(1)}\,{u'}^\gamma_{(2)},$

together with

(3.7b) $g_{\alpha\gamma}(u, u'_{(2)})\,{u'}^\alpha_{(2)}\,{u'}^\gamma_{(1)} = R_{(2)}\,\Omega_{\alpha\gamma}(u, u'_{(2)})\,{u'}^\alpha_{(2)}\,{u'}^\gamma_{(1)}.$

Since the ${u'}^\alpha$ are unit vectors, the left-hand sides of these equations are by definition the Minkowskian cosines $\cos(u'_{(1)}, u'_{(2)})$ and $\cos(u'_{(2)}, u'_{(1)})$ respectively (6). Thus on subtraction we find the following relations between principal directions:

(3.8) $\dfrac{\cos(u'_{(1)}, u'_{(2)}) - \cos(u'_{(2)}, u'_{(1)})}{R_{(1)} R_{(2)}} = \left[\dfrac{\Omega_{\alpha\gamma}(u, u'_{(1)})}{R_{(2)}} - \dfrac{\Omega_{\alpha\gamma}(u, u'_{(2)})}{R_{(1)}}\right]{u'}^\alpha_{(1)}\,{u'}^\gamma_{(2)}.$

This is a generalisation of the orthogonality relations between principal directions of hypersurfaces of locally euclidean spaces. For if the cosine were symmetric in its directional arguments, and if the coefficients of the second fundamental form were independent of direction, it would follow from (3.8) that principal directions would correspond to conjugate directions of the Dupin indicatrix, and hence either (3.7a) or (3.7b) would lead to the law of orthogonality.
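The reduction to the classical case described above can be illustrated numerically. The following sketch assumes the special situation in which both $g_{\alpha\gamma}$ and $\Omega_{\alpha\gamma}$ are independent of direction, so that (3.6) becomes the linear problem $\Omega_{\alpha\gamma}{u'}^\alpha = R^{-1} g_{\alpha\gamma}{u'}^\alpha$; the two $2 \times 2$ matrices are made-up values for a two-dimensional surface, not data from the paper. In the genuinely direction-dependent case no such closed-form solution is claimed.

```python
import math

# Hypothetical direction-independent coefficients on a 2-dimensional surface.
g = [[2.0, 0.5], [0.5, 1.0]]        # first fundamental form (positive definite)
omega = [[1.0, 0.2], [0.2, -0.5]]   # second fundamental form (symmetric)

def bilinear(m, u, v):
    return sum(m[a][c] * u[a] * v[c] for a in range(2) for c in range(2))

def principal_curvatures(g, omega):
    """Roots kappa = 1/R of det(omega - kappa * g) = 0."""
    a = g[0][0] * g[1][1] - g[0][1] ** 2
    b = -(omega[0][0] * g[1][1] + omega[1][1] * g[0][0]
          - 2.0 * omega[0][1] * g[0][1])
    c = omega[0][0] * omega[1][1] - omega[0][1] ** 2
    root = math.sqrt(b * b - 4.0 * a * c)
    return ((-b + root) / (2.0 * a), (-b - root) / (2.0 * a))

def principal_direction(g, omega, kappa):
    """Unit vector (g(u, u) = 1) in the null space of omega - kappa * g."""
    r0 = (omega[0][0] - kappa * g[0][0], omega[0][1] - kappa * g[0][1])
    r1 = (omega[1][0] - kappa * g[1][0], omega[1][1] - kappa * g[1][1])
    # Use the larger of the two (proportional) rows for numerical stability.
    row = r0 if abs(r0[0]) + abs(r0[1]) > abs(r1[0]) + abs(r1[1]) else r1
    u = (-row[1], row[0])
    length = math.sqrt(bilinear(g, u, u))
    return (u[0] / length, u[1] / length)

k1, k2 = principal_curvatures(g, omega)
u1 = principal_direction(g, omega, k1)
u2 = principal_direction(g, omega, k2)
```

In this special case `bilinear(g, u1, u2)` and `bilinear(omega, u1, u2)` both vanish up to rounding, i.e. the two principal directions are orthogonal and conjugate, which is the classical law that (3.8) generalises.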


4. The covariant derivative of the unit normals. For a large number 
of problems it is essential to have a convenient expression for the covariant 
derivative of the various unit normal vectors. In this section we shall obtain 
such formulae and use them in a discussion of a few simple applications. 
In the next section these relations will be essential in the derivation of the 
generalised Gauss-Codazzi equations. A difficulty peculiar to locally Min- 
kowskian spaces is the fact that the covariant derivative of the unit normals is 
not tangential to the hypersurface: it is due to this fact alone that our formulae 
are more complicated than the corresponding relations in Riemannian 
geometry. 














We define the tensor

(4.1) $n^i_{;\beta} = n^i_{;k}\,X^k_\beta = \dfrac{\partial n^i}{\partial u^\beta} + P^{*i}_{hk}\,n^h X^k_\beta.$

By writing

(4.2) $C_{ijk}(x, x') = g_{ij;k}(x, n)$

for the covariant derivative of the $g_{ij}(x, n)$ with respect to $x^k$ (8, §2), we find that covariant differentiation of (1.3) with respect to $u^\beta$ gives

(4.3) $\Omega_{\alpha\beta} = -C_{ijk}\,X^k_\beta X^i_\alpha n^j - g_{ij}(x, n)\,X^i_\alpha\,n^j_{;\beta}.$

Now let us decompose $n^i_{;\beta}$ as follows:

(4.4) $n^i_{;\beta} = A^\gamma_\beta\,X^i_\gamma + \nu_\beta\,n^i,$

where the coefficients are to be determined as follows. Multiplying (4.4) by $g_{ij}(x, n)\,X^j_\alpha$ we find in virtue of (1.3), (1.12) and (4.3)

(4.5) $\gamma_{\alpha\gamma}\,A^\gamma_\beta = -\Omega_{\alpha\beta} - C_{ijk}\,X^i_\alpha X^k_\beta n^j.$

Also, on multiplying (4.4) by $n_i$ we see that

(4.6) $\nu_\beta = n_i\,n^i_{;\beta}.$

Differentiating (1.4) covariantly with respect to $u^\beta$ and taking into account (4.1) and (4.2) we have

(4.7) $\nu_\beta = -\dfrac{1}{2}\,C_{ijk}\,n^i n^j X^k_\beta.$

On substituting (4.5) and (4.7) in (4.4) we deduce that

$n^i_{;\beta} = -\gamma^{\alpha\gamma}\,\Omega_{\alpha\beta}\,X^i_\gamma - C_{hjk}\,n^j X^k_\beta\left(\gamma^{\alpha\gamma}\,X^h_\alpha X^i_\gamma + \dfrac{1}{2}\,n^h n^i\right).$

Hence in view of (1.22) we have the desired formula:

(4.8) $n^i_{;\beta} = -\gamma^{\alpha\gamma}\,\Omega_{\alpha\beta}\,X^i_\gamma - C_{hjk}\,n^j X^k_\beta\left(g^{hi}(x, n) - \dfrac{1}{2}\,n^h n^i\right).$

At first sight one might be led to suspect that the term $C_{hjk}\,n^j$ implicitly involves the derivatives $\partial n^i/\partial u^\beta$: this is not the case, however, since the term containing these derivatives vanishes identically in view of (0.2). Thus the covariant derivative (4.8) depends only on positional coordinates and the direction ${x'}^i$ along which we are differentiating, i.e. it is the same for all curves of $F_{n-1}$ which have a common tangent ${x'}^i$ at the point under consideration.

It is also necessary to evaluate the covariant derivatives of the generators $n^{*i}$ of the normal cone. By a process similar to the one described above, we find after some calculation:
















(4.9) $n^{*i}_{;\beta} = -\psi(x, x')\,g^{\alpha\gamma}(u, u')\,\Omega^*_{\alpha\beta}\,X^i_\gamma + \dfrac{n^{*i}(x, x')}{2\psi(x, x')}\,\psi_{;\beta} - C^*_{hjk}\,n^{*j} X^k_\beta\left(g^{hi}(x, x') - \dfrac{n^{*h} n^{*i}}{2\psi(x, x')}\right),$

where

(4.10) $C^*_{ijk}(x, x') = g_{ij;k}(x, x').$

In contrast to (4.8), equation (4.9) suffers from the drawback that the term $\psi_{;\beta}$ on the right-hand side involves the derivatives of the tangent $x'$ to the curve along which we are differentiating, so that (4.9) depends on the curve under consideration.

As a first application of these formulae let us consider principal directions as defined in the preceding section. From (4.1) and (4.8) we have

(4.11) $\dfrac{Dn^i}{Ds} = -\gamma^{\alpha\gamma}\,\Omega_{\alpha\beta}\,{u'}^\beta X^i_\gamma - C_{hjk}\,n^j {x'}^k\left[g^{hi}(x, n) - \dfrac{1}{2}\,n^h n^i\right].$

Using (4.2) we may write

$\dfrac{Dn_i}{Ds} = C_{ijk}\,n^j {x'}^k + g_{ij}(x, n)\,\dfrac{Dn^j}{Ds},$

and on substituting from (4.11) in the last term of this equation, we find after some simplification

(4.12) $\dfrac{Dn_i}{Ds} = -g_{ij}(x, n)\,X^j_\gamma\,\gamma^{\alpha\gamma}\,\Omega_{\alpha\beta}\,{u'}^\beta + \dfrac{1}{2}\,n_i\,(C_{hjk}\,n^h n^j {x'}^k).$

Hence from (1.3) and (1.12) we deduce in particular:

(4.13) $X^i_\alpha\,\dfrac{Dn_i}{Ds} = -\Omega_{\alpha\beta}\,{u'}^\beta.$

If $(R(x, x'))^{-1}$ is the normal curvature corresponding to a principal direction ${x'}^i$ of $F_{n-1}$, we have from (3.6) and (4.13):

$X^i_\alpha\,\dfrac{Dn_i}{Ds} = -(R(x, x'))^{-1}\,g_{\alpha\beta}(u, u')\,{u'}^\beta,$

or, if we denote the covariant components of the unit vector representing the principal direction by $y_\alpha$,

(4.14) $X^i_\alpha\,\dfrac{Dn_i}{Ds} = -(R(u, u'))^{-1}\,y_\alpha.$

Thus the projection of the covariant differential of the unit normal onto $F_{n-1}$ coincides with the principal direction. This is a generalisation of the classical formula of Rodrigues.⁷ It should be noted, however, that in contrast to the classical theory, the covariant differential of $n^i$ has a normal component, in general non-vanishing, even in the case of principal directions.





⁷In (7), Part II, a similar result was obtained for principal directions corresponding to the
alternative second fundamental form.




Another simple application of the equation (4.8) is the generalisation of an important formula due to Bianchi (1, p. 450) concerning the deformation of hypersurfaces in classical differential geometry, which was later generalised by Davies (3, p. 291) in his theory of the second and third fundamental forms of subspaces of a Riemannian space. At each point $x^i$ of $F_{n-1}$ we construct the unit normal $n^i$; the locus of points whose coordinates are $x^i + \epsilon n^i$ (where $\epsilon$ is an arbitrarily small quantity) forms a new hypersurface $\bar{F}_{n-1}$. Let $P(x^i)$, $Q(x^i + dx^i)$ be two neighbouring points of $F_{n-1}$, a distance $ds$ apart. There will be two neighbouring points $P'(x^i + \epsilon n^i)$, $Q'(x^i + dx^i + \epsilon(n^i + dn^i))$ on $\bar{F}_{n-1}$ corresponding to $P$, $Q$ respectively, where $dn^i$ corresponds to the change in $n^i$ as we pass from $P$ to $Q$. If we denote the distance between $P'$ and $Q'$ by $d\bar{s}$, we have

$d\bar{s}^2 = g_{ij}(x^k + \epsilon n^k,\ dx^k + \epsilon\,dn^k)\,(dx^i + \epsilon\,dn^i)\,(dx^j + \epsilon\,dn^j).$

Expanding the expression on the right-hand side and dividing by $ds^2$, we find

$\left(\dfrac{d\bar{s}}{ds}\right)^2 - 1 = \epsilon\left[2g_{ij}(x, x')\,{x'}^i\,\dfrac{dn^j}{ds} + \dfrac{\partial g_{ij}(x, x')}{\partial x^h}\,{x'}^i {x'}^j n^h\right],$

or, after some rearrangement,

(4.15) $\left(\dfrac{d\bar{s}}{ds}\right)^2 - 1 = 2\epsilon\left[g_{ij}(x, x')\,{x'}^i\,\dfrac{dn^j}{ds} + [hk, i]\,n^h {x'}^k {x'}^i\right],$

where we have neglected terms involving higher powers of $\epsilon$.

But from definition (0.4) we have

$g_{ij}(x, x')\,{x'}^i\,\dfrac{Dn^j}{Ds} = g_{ij}(x, x')\,{x'}^i\,\dfrac{dn^j}{ds} + [hk, i]\,n^h {x'}^k {x'}^i - \dfrac{1}{2}\,g_{ij}\,g^{jm}\,\dfrac{\partial g_{hm}}{\partial {x'}^l}\left\{{l \atop p\,k}\right\}{x'}^p {x'}^k\,n^h {x'}^i,$

and in view of the relation $g_{ij}\,g^{jm} = \delta^m_i$ it follows from (0.2) that the last term on the right-hand side vanishes. Thus (4.15) becomes

(4.16) $\left(\dfrac{d\bar{s}}{ds}\right)^2 - 1 = 2\epsilon\,g_{ij}(x, x')\,{x'}^i\,\dfrac{Dn^j}{Ds}.$

In the locally euclidean case this would simply become the formula of Bianchi or Davies (loc. cit.) as a result of (2.2) and (3.1). In the present case the position is a little more complicated: using (4.11), (1.10) and (1.22), equation (4.16) may be written in the form

$\left(\dfrac{d\bar{s}}{ds}\right)^2 - 1 = -2\epsilon\left\{g_{\beta\gamma}(u, u')\,\gamma^{\alpha\gamma}\,\Omega_{\alpha\delta}\,{u'}^\delta {u'}^\beta + g_{ij}(x, x')\,X^i_\beta\,C_{hlk}\,n^l X^k_\delta\,{u'}^\beta {u'}^\delta\left(\gamma^{\alpha\gamma}\,X^h_\alpha X^j_\gamma + \dfrac{1}{2}\,n^h n^j\right)\right\}.$


Applying (1.10) once more together with (1.17), we find after some simplification

(4.17) $\left(\dfrac{d\bar{s}}{ds}\right)^2 - 1 = -2\epsilon\,\omega_{\beta\delta}(u, u')\,{u'}^\beta {u'}^\delta,$













where we have put

(4.18) $\omega_{\beta\delta}(u, u') = g_{\beta\gamma}(u, u')\,\gamma^{\alpha\gamma}\left[\Omega_{\alpha\delta} + C_{hlk}\,n^l X^k_\delta\left(X^h_\alpha + \dfrac{1}{2}\,X^\epsilon_j\,\gamma_{\alpha\epsilon}\,n^h n^j\right)\right].$

It appears, therefore, that in the general case the $\Omega_{\alpha\beta}$ do not possess all the essential properties which one may attribute to them in Riemannian geometry. Nevertheless, it would not be feasible to introduce the $\omega_{\beta\delta}$ of (4.18) as the coefficients of an alternative second fundamental form since these quantities are not symmetrical in their lower indices.


5. The equations of Gauss and Codazzi. In order to find the desired relations between the coefficients $\Omega_{\alpha\beta}$ of the second fundamental form of $F_{n-1}$ and the curvature tensor of $F_n$, it is necessary to express the $X^i_{\alpha\beta}$ in terms of the unit normal vector $n^i$ in a manner analogous to equation (2.13). We therefore define a new set of quantities $\omega^i_{\alpha\beta}$ by means of the equations

(5.1) $X^i_{\alpha\beta} = \Omega_{\alpha\beta}\,n^i + \omega^i_{\alpha\beta}.$

It is simple to derive an explicit expression for the $\omega^i_{\alpha\beta}$. Using (2.13) and (2.16) we see that (5.1) may be written as

(5.2) $\omega^i_{\alpha\beta} = \Omega_{\alpha\beta}\,\{n^{*i}\sec(n, n^*) - n^i\}.$

If we decompose the vector $n^i$ in the form

(5.3) $n^i = \phi\,n^{*i} - \mu^\alpha X^i_\alpha,$

we find on multiplication of this equation by $n_i$ that $\phi = \sec(n, n^*)$ in virtue of (1.3); and similarly, on multiplying (5.3) by $X^\alpha_i$ and taking into account (1.15) and (1.16), we see that

(5.4) $\mu^\alpha = -n^i X^\alpha_i.$

The vector $\mu^\alpha(u, u')$ thus expresses the difference between the unit normal vectors $n^i$ and $n^{*i}(x, x')$. From (5.3) and (5.4) we finally deduce that

(5.5) $\omega^i_{\alpha\beta} = \Omega_{\alpha\beta}\,\mu^\gamma X^i_\gamma,$

where $\mu^\gamma$ is given by (5.4).

It may be verified by direct calculation that the process of generalised covariant differentiation⁸ leads to the identities

(5.6) $X^i_{\alpha\beta;\gamma} - X^i_{\alpha\gamma;\beta} = R^\epsilon_{\alpha\beta\gamma}\,X^i_\epsilon - R^i_{hjk}\,X^h_\alpha X^j_\beta X^k_\gamma,$

where the subscripts $\beta$ and $\gamma$ on the left-hand side indicate covariant differentiation with respect to $u^\beta$ and $u^\gamma$ respectively, and where $R^i_{hjk}$ and $R^\epsilon_{\alpha\beta\gamma}$ represent the curvature tensors (8, p. 91 (3.7)) of $F_n$ and $F_{n-1}$. From (5.6) and (5.1) we therefore have





⁸Throughout this section the directional arguments are the components of a vector $x'$
tangent to the hypersurface, corresponding to the direction along which the covariant
differentiation takes place.







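(5.7) $X^i_\epsilon\,R^\epsilon_{\alpha\beta\gamma} = R^i_{hjk}\,X^h_\alpha X^j_\beta X^k_\gamma + n^i\,(\Omega_{\alpha\beta;\gamma} - \Omega_{\alpha\gamma;\beta}) + \omega^i_{\alpha\beta;\gamma} - \omega^i_{\alpha\gamma;\beta} + \Omega_{\alpha\beta}\,n^i_{;\gamma} - \Omega_{\alpha\gamma}\,n^i_{;\beta},$

where $\omega^i_{\alpha\beta;\gamma}$ denotes the generalised covariant derivative of $\omega^i_{\alpha\beta}$ with respect to $u^\gamma$. In this equation we substitute the values of $n^i_{;\gamma}$ and $n^i_{;\beta}$ as given by equation (4.8). After some factorisation, equation (5.7) finally reduces to the form

(5.8) $X^i_\epsilon\,[R^\epsilon_{\alpha\beta\gamma} - \gamma^{\delta\epsilon}\,(\Omega_{\alpha\gamma}\,\Omega_{\delta\beta} - \Omega_{\alpha\beta}\,\Omega_{\delta\gamma})] = R^i_{hjk}\,X^h_\alpha X^j_\beta X^k_\gamma + (\omega^i_{\alpha\beta;\gamma} - \omega^i_{\alpha\gamma;\beta}) - g^{hi}(x, n)\,C_{hlk}\,n^l\,(\Omega_{\alpha\beta}\,X^k_\gamma - \Omega_{\alpha\gamma}\,X^k_\beta) - n^i\left[\Omega_{\alpha\gamma;\beta} - \Omega_{\alpha\beta;\gamma} - \dfrac{1}{2}\,C_{hlk}\,n^h n^l\,(\Omega_{\alpha\beta}\,X^k_\gamma - \Omega_{\alpha\gamma}\,X^k_\beta)\right].$

We multiply this equation by $g_{ij}(x, n)\,X^j_\delta$. In view of (1.12) and (1.3) we thus obtain

(5.9) $\gamma_{\delta\epsilon}\,R^\epsilon_{\alpha\beta\gamma} - (\Omega_{\alpha\gamma}\,\Omega_{\delta\beta} - \Omega_{\alpha\beta}\,\Omega_{\delta\gamma}) = g_{ij}(x, n)\,R^i_{hlk}\,X^h_\alpha X^l_\beta X^k_\gamma X^j_\delta - C_{jlk}\,n^l\,(\Omega_{\alpha\beta}\,X^k_\gamma - \Omega_{\alpha\gamma}\,X^k_\beta)\,X^j_\delta + g_{ij}(x, n)\,X^j_\delta\,(\omega^i_{\alpha\beta;\gamma} - \omega^i_{\alpha\gamma;\beta}).$

If, on the other hand, we multiply equation (5.8) by $g_{ij}(x, n)\,n^j$ we find, after taking into account (1.3) and (1.4) and suitable rearrangement of indices:

(5.10) $g_{ij}(x, n)\,R^i_{hlk}\,n^j X^h_\alpha X^l_\beta X^k_\gamma = (\Omega_{\alpha\gamma;\beta} - \Omega_{\alpha\beta;\gamma}) + \dfrac{1}{2}\,C_{hlk}\,n^h n^l\,(\Omega_{\alpha\beta}\,X^k_\gamma - \Omega_{\alpha\gamma}\,X^k_\beta) - g_{ij}(x, n)\,n^j\,(\omega^i_{\alpha\beta;\gamma} - \omega^i_{\alpha\gamma;\beta}).$

Equations (5.9) and (5.10) represent the generalisations of the equations of Gauss and Codazzi of classical differential geometry. On comparing these equations with the corresponding equations (5, p. 162 (4.11) and (4.12)) of Riemannian geometry, we see that the essential differences (apart from the impossibility of contracting terms with different directional arguments) lie in the additional terms involving the $C_{ijk}$ and the $\omega^i_{\alpha\beta}$. This, again, is owing to the fact that the covariant derivative of the unit normals is not tangential to the hypersurface and that different normals have to be taken into account. However, it is possible to remove the terms in (5.9) and (5.10) involving the $\omega^i_{\alpha\beta}$, and to replace these terms by expressions involving the $\Omega_{\alpha\beta}$. If we write down the generalised covariant derivative of (5.5) and use (5.1) we find

(5.11) $\omega^i_{\alpha\beta;\gamma} = (\Omega_{\alpha\beta;\gamma}\,\mu^\epsilon + \Omega_{\alpha\beta}\,\mu^\epsilon_{;\gamma})\,X^i_\epsilon + \Omega_{\alpha\beta}\,\mu^\epsilon\,(\Omega_{\epsilon\gamma}\,n^i + \omega^i_{\epsilon\gamma}).$

Hence, on observing (1.12) and (1.3), we thus obtain

$g_{ij}(x, n)\,X^j_\delta\,\omega^i_{\alpha\beta;\gamma} = \gamma_{\delta\epsilon}\,(\Omega_{\alpha\beta;\gamma}\,\mu^\epsilon + \Omega_{\alpha\beta}\,\mu^\epsilon_{;\gamma}) + g_{ij}(x, n)\,X^j_\delta\,\omega^i_{\epsilon\gamma}\,\Omega_{\alpha\beta}\,\mu^\epsilon.$

In the last term of this expression we substitute from (5.5), so that (1.12) may be applied once more. Thus the last equation becomes

(5.12) $g_{ij}(x, n)\,X^j_\delta\,\omega^i_{\alpha\beta;\gamma} = \gamma_{\delta\epsilon}\,(\Omega_{\alpha\beta;\gamma}\,\mu^\epsilon + \Omega_{\alpha\beta}\,\mu^\epsilon_{;\gamma} + \Omega_{\eta\gamma}\,\Omega_{\alpha\beta}\,\mu^\eta \mu^\epsilon).$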










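Also, it follows from (5.5) and (1.3) that

$n_i\,\omega^i_{\alpha\beta} = 0.$

Hence on multiplying (5.11) by $n_i$ and taking into account (1.3) and (1.4), we find that

(5.13) $n_i\,\omega^i_{\alpha\beta;\gamma} = \Omega_{\alpha\beta}\,\Omega_{\epsilon\gamma}\,\mu^\epsilon.$

On substituting from (5.12) and (5.13) in (5.9) and (5.10) respectively, we obtain the equations of Gauss and Codazzi in their final form:

(5.14) $\gamma_{\delta\epsilon}\,R^\epsilon_{\alpha\beta\gamma} - (\Omega_{\alpha\gamma}\,\Omega_{\delta\beta} - \Omega_{\alpha\beta}\,\Omega_{\delta\gamma}) = g_{ij}(x, n)\,R^i_{hlk}\,X^h_\alpha X^l_\beta X^k_\gamma X^j_\delta - C_{jlk}\,n^l\,(\Omega_{\alpha\beta}\,X^k_\gamma - \Omega_{\alpha\gamma}\,X^k_\beta)\,X^j_\delta + \gamma_{\delta\epsilon}\,\{\mu^\epsilon\,(\Omega_{\alpha\beta;\gamma} - \Omega_{\alpha\gamma;\beta}) + (\Omega_{\alpha\beta}\,\mu^\epsilon_{;\gamma} - \Omega_{\alpha\gamma}\,\mu^\epsilon_{;\beta}) + (\Omega_{\eta\gamma}\,\Omega_{\alpha\beta} - \Omega_{\eta\beta}\,\Omega_{\alpha\gamma})\,\mu^\eta \mu^\epsilon\},$

together with

(5.15) $g_{ij}(x, n)\,R^i_{hlk}\,n^j X^h_\alpha X^l_\beta X^k_\gamma = (\Omega_{\alpha\gamma;\beta} - \Omega_{\alpha\beta;\gamma}) + \dfrac{1}{2}\,C_{hlk}\,n^h n^l\,(\Omega_{\alpha\beta}\,X^k_\gamma - \Omega_{\alpha\gamma}\,X^k_\beta) - (\Omega_{\alpha\beta}\,\Omega_{\epsilon\gamma} - \Omega_{\alpha\gamma}\,\Omega_{\epsilon\beta})\,\mu^\epsilon.$

It is clear that different forms of the Gauss-Codazzi equations are obtained when one considers the secondary second fundamental form

$\Omega^*_{\alpha\beta}\,{u'}^\alpha {u'}^\beta$

together with a given generator $n^{*i}(x, x')$ of the normal cone; i.e. when equation (2.13) is used instead of (2.15). The calculation proceeds along similar lines to the one outlined above, and will therefore be omitted. Instead of (5.7) one obtains

(5.16) $g^{\delta\epsilon}(u, u')\,X^i_\epsilon\,[R_{\delta\alpha\beta\gamma}(u, u') - \psi\,(\Omega^*_{\alpha\gamma}\,\Omega^*_{\delta\beta} - \Omega^*_{\alpha\beta}\,\Omega^*_{\delta\gamma})] = R^i_{hlk}(x, x')\,X^h_\alpha X^l_\beta X^k_\gamma - g^{hi}(x, x')\,C^*_{hlk}\,n^{*l}\,(\Omega^*_{\alpha\beta}\,X^k_\gamma - \Omega^*_{\alpha\gamma}\,X^k_\beta) - n^{*i}\left[(\Omega^*_{\alpha\gamma;\beta} - \Omega^*_{\alpha\beta;\gamma}) - \dfrac{1}{2\psi}\,C^*_{hlk}\,n^{*h} n^{*l}\,(\Omega^*_{\alpha\beta}\,X^k_\gamma - \Omega^*_{\alpha\gamma}\,X^k_\beta) - \dfrac{1}{2\psi}\,(\Omega^*_{\alpha\beta}\,\psi_{;\gamma} - \Omega^*_{\alpha\gamma}\,\psi_{;\beta})\right].$

On multiplying this equation by $g_{ij}(x, x')\,X^j_\delta$ we obtain in virtue of (1.10) and (1.6):

(5.17) $R_{\delta\alpha\beta\gamma}(u, u') - \psi\,(\Omega^*_{\alpha\gamma}\,\Omega^*_{\delta\beta} - \Omega^*_{\alpha\beta}\,\Omega^*_{\delta\gamma}) = R_{jhlk}(x, x')\,X^j_\delta X^h_\alpha X^l_\beta X^k_\gamma - C^*_{jlk}\,n^{*l} X^j_\delta\,(\Omega^*_{\alpha\beta}\,X^k_\gamma - \Omega^*_{\alpha\gamma}\,X^k_\beta).$

In analogy to (5.15) we find similarly by means of (1.6) and (1.8):

(5.18) $R_{jhlk}(x, x')\,n^{*j} X^h_\alpha X^l_\beta X^k_\gamma = \psi\,(\Omega^*_{\alpha\gamma;\beta} - \Omega^*_{\alpha\beta;\gamma}) + \dfrac{1}{2}\,C^*_{hlk}\,n^{*h} n^{*l}\,(\Omega^*_{\alpha\beta}\,X^k_\gamma - \Omega^*_{\alpha\gamma}\,X^k_\beta) - \dfrac{1}{2}\,(\Omega^*_{\alpha\beta}\,\psi_{;\gamma} - \Omega^*_{\alpha\gamma}\,\psi_{;\beta}).$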




Equations (5.17) and (5.18) thus represent alternative forms of the
generalisations of the equations of Gauss and Codazzi. As regards the study of
imbedding problems it is probably advantageous to base any such discussion
on equations (5.14) and (5.15).

In conclusion we may remark that the equations of Gauss and Codazzi
in Riemannian geometry are known to be dependent on each other to a certain
extent: whether this is true also for the general case is still an open question.


REFERENCES 


1. L. Bianchi, Lezioni di Geometria Differenziale, vol. II, parte II (Bologna, 1924).
2. E. Cartan, Les espaces de Finsler, Actualités 79 (Paris, 1934).
3. E. T. Davies, On the second and third fundamental forms of a subspace, J. London Math. Soc. 12 (1937), 290-295.
4. E. T. Davies, Subspaces of a Finsler space, Proc. London Math. Soc. Ser. 2, 49 (1945), 19-39.
5. L. P. Eisenhart, Riemannian geometry (Princeton, 1926).
6. H. Rund, Differentialgeometrie der Minkowskischen Räume, Arch. d. Math. 3 (1952), 60-69.
7. H. Rund, The theory of subspaces of a Finsler space, Part I: Math. Z. 56 (1952), 363-375; Part II: Math. Z. 57 (1953), 193-210.
8. H. Rund, On the analytical properties of curvature tensors in Finsler spaces, Math. Ann. 127 (1954), 82-104.
9. C. E. Weatherburn, Introduction to tensor calculus and Riemannian geometry (Cambridge, 1938).


University of Natal, South Africa 




















ON THE ZEROS OF SOLUTIONS OF SECOND-ORDER 
LINEAR DIFFERENTIAL EQUATIONS 


P. R. BEESACK anp BINYAMIN SCHWARZ 


Introduction. In §1 of this paper we consider the complex differential 
equation 


(1) $u''(z) + q(z)\,u(z) = 0, \qquad |z| < 1,$


where q(z) is a regular function in the open unit circle. We shall give a lower 
bound for the non-Euclidean distance of any pair of zeros of any non-trivial 
(i.e., not identically zero) solution u(z) of (1). This theorem (Theorem 1) is a 
generalization of a recent theorem of Nehari (7, Theorem I) quoted below. 
The first part of our proof will use a complex technique already used elsewhere 
(1, Theorem 2.1). However, for the final step in the proof of this theorem 
we need a result on the (real) zeros of the (real) solutions of the real differential 
equation 


(2) $y''(r) + M(r)\,y(r) = 0, \qquad -1 < r < 1,$


under certain restrictive assumptions on the function M(r). This result 
(Lemma 1), whose proof we shall delay to the very end of our paper, is a 
consequence of a theorem of §2 giving a lower bound for the least positive 
eigenvalue of the real differential system 


(3) $y''(x) + \lambda\,p(x)\,y(x) = 0, \qquad y(x_0) = y(-x_0) = 0, \qquad 0 < x_0 < \infty,$


where $p(x)$ is a real function, continuous in $-x_0 < x < x_0$, and changing sign
only a finite number of times in this interval (Theorem 2). We shall, however,
also obtain an upper bound for $\lambda$ in the simpler, and more often considered,
case where $p(x)$ is a continuous function which is non-negative in the whole
interval $-x_0 < x < x_0$.
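For orientation, the least positive eigenvalue of the system (3) can be approximated numerically and compared with any analytic bound. The following is a minimal sketch, not the method of this paper: it assumes a continuous, non-negative weight $p(x)$, integrates the initial value problem from $-x_0$ with a classical Runge-Kutta scheme, and searches for the first value of $\lambda$ at which the solution vanishes at $x_0$. The function names, step counts and tolerances are illustrative choices.

```python
import math

def endpoint_value(lam, p, x0, steps=2000):
    """Integrate y'' + lam*p(x)*y = 0 from -x0 with y = 0, y' = 1 (RK4); return y(x0)."""
    h = 2.0 * x0 / steps
    x, y, v = -x0, 0.0, 1.0

    def acc(xx, yy):
        return -lam * p(xx) * yy

    for _ in range(steps):
        k1y, k1v = v, acc(x, y)
        k2y, k2v = v + 0.5 * h * k1v, acc(x + 0.5 * h, y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, acc(x + 0.5 * h, y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, acc(x + h, y + h * k3y)
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        x += h
    return y

def least_eigenvalue(p, x0, lam_step=0.05, lam_max=200.0, tol=1e-10):
    """Scan lam upwards until y(x0) first changes sign, then bisect."""
    lo = lam_step
    if endpoint_value(lo, p, x0) <= 0.0:
        raise ValueError("lam_step is too coarse for this problem")
    hi = lo + lam_step
    while endpoint_value(hi, p, x0) > 0.0:
        lo, hi = hi, hi + lam_step
        if hi > lam_max:
            raise ValueError("no eigenvalue found below lam_max")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if endpoint_value(mid, p, x0) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Sanity check against the constant weight p = 1, for which the least
# eigenvalue of (3) is (pi / (2 * x0))**2.
lam_unit = least_eigenvalue(lambda x: 1.0, 1.0)
lam_wide = least_eigenvalue(lambda x: 1.0, 2.0)
```

For $p \equiv 1$ the computed values agree with $(\pi/2x_0)^2$; for other non-negative weights the same routine gives a numerical value against which lower and upper bounds for $\lambda$ can be compared.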


1. Non-Euclidean distance. In his first paper on this subject (6), 
Nehari made use of a fundamental relationship between the theory of the 
differential equation (1) and the behaviour of analytic functions f(z). If we 
denote the Schwarzian derivative of $f(z)$ by $\{f(z), z\}$ and if we set $q(z) =
\tfrac{1}{2}\{f(z), z\}$ then the univalence of $f(z)$ in $|z| < 1$ is equivalent to the fact that
no non-trivial solution u(z) of (1) has two zeros in |z| < 1. Bearing this in 
mind, we now restate the above-mentioned theorem of Nehari for differential 
equations (instead of stating it—as in the original— as a criterion of uni- 
valence). 


Received November 24, 1955; in revised form April 6, 1956. 

























ZEROS OF SOLUTIONS OF LINEAR DIFFERENTIAL EQUATIONS 505 


THEOREM I (7). Let q(z) be regular in |z| < 1 and suppose there exists a
function M(r) satisfying

(4) $|q(z)| \le M(|z|), \qquad |z| < 1,$

and having the following properties: (a) M(r) is positive and continuous for
$-1 < r < 1$; (b) $M(-r) = M(r)$; (c) $(1 - r^2)^2 M(r)$ is non-increasing as r
varies from 0 to 1; (d) the differential equation

(2) $y''(r) + M(r)\,y(r) = 0, \qquad -1 < r < 1,$

has a (real) solution y(r) which does not vanish for $-1 < r < 1$. Then no
non-trivial (complex) solution u(z) of the differential equation

(1) $u''(z) + q(z)\,u(z) = 0, \qquad |z| < 1,$

has two zeros in |z| < 1. The right-hand side of (4) cannot be replaced by an
expression of the form $C\,M(|z|)$ for any constant C > 1.


We shall keep conditions (a)-(c), but drop condition (d). For the statement
of our theorem let us agree to denote the non-Euclidean¹ segment between
two points $z_1$ and $z_2$ inside the unit circle by $[z_1 z_2]$, and to denote the
non-Euclidean distance between these two points by $|[z_1 z_2]|$. We have then

$|[z_1 z_2]| = \int_{z_1}^{z_2} \frac{|dz|}{1 - |z|^2},$

where the integration is along $[z_1 z_2]$. We now state


THEOREM 1. Let q(z) be regular in |z| < 1, and suppose there exists a function
M(r) satisfying (4) and having properties (a)-(c) of Theorem I (7). Moreover,
assume (d'): there exists a (necessarily even) solution y(r) of (2) for which

$y(a) = y(-a) = 0, \qquad 0 < a < 1,$

$y(r) \ne 0, \qquad -a < r < a.$

Let u(z) be any (non-trivial) solution of the differential equation (1), and assume
that $u(z_1) = u(z_2) = 0$, $|z_1| < 1$, $|z_2| < 1$, $z_1 \ne z_2$. Then

(5) $|[z_1 z_2]| \ge \log\frac{1+a}{1-a} = |[-a, a]|.$


We remark that the existence of an (essentially unique) even solution of 
(2) follows from condition (b) and that, in view of the Sturm separation 
theorem, conditions (d) and (d’) are mutually exclusive. 
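
The distance used in (5) is easy to evaluate numerically. The following is a minimal sketch, assuming the line element $|dz|/(1 - |z|^2)$ that is consistent with the value $\log[(1+a)/(1-a)]$ of $|[-a, a]|$; the function name `noneuclidean_distance` and the sample value of a are ours and do not come from the paper.

```python
import math

def noneuclidean_distance(z1: complex, z2: complex) -> float:
    """Distance between z1 and z2 in |z| < 1 for the line element |dz| / (1 - |z|^2)."""
    if abs(z1) >= 1 or abs(z2) >= 1:
        raise ValueError("both points must lie inside the unit circle")
    # Modulus of the image of z2 under the disc automorphism sending z1 to 0.
    t = abs(z1 - z2) / abs(1 - z1.conjugate() * z2)
    # Along a radius the distance from 0 to t is (1/2) log((1+t)/(1-t)) = artanh(t).
    return math.atanh(t)

a = 0.5  # illustrative value of the zero a in condition (d')
d = noneuclidean_distance(complex(-a, 0.0), complex(a, 0.0))
print(d, math.log((1 + a) / (1 - a)))  # the two numbers should agree
```

With this normalization the distance between two points of the disc |z| < a never exceeds the length of the diameter [-a, a], which is how (5) excludes two zeros in such a disc.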

¹ The non-Euclidean distance refers to the Klein-Poincaré hyperbolic geometry.

The first part of the proof of this theorem follows closely the proof of
Theorem I (7). That is, we assume that a non-trivial solution u(z) of (1) vanishes
at the points $z_1$, $z_2$ inside the unit circle, and consider the circle which passes
through these two points and is orthogonal to |z| = 1; let us denote its whole
arc inside the unit circle by C. C is, therefore, the non-Euclidean straight line
containing $[z_1 z_2]$. Without loss of generality we may assume that C lies in the
upper half-plane and is symmetric with respect to the imaginary axis. Indeed,
this position can always be achieved by a rotation $\zeta = az$, $|a| = 1$, which
transforms (1) into

$u_1''(\zeta) + a^{-2} q(\zeta/a)\,u_1(\zeta) = 0,$

where $u_1(\zeta) = u(z)$. But clearly $a^{-2} q(z/a)$ is, together with q(z), majorized by
M(|z|). We assume therefore that C is already in this symmetrical position, and
denote the imaginary point of C (which may or may not lie in $[z_1 z_2]$) by $i\beta$,
$0 \le \beta < 1$. The linear transformation


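$w = \frac{z - i\beta}{1 + i\beta z}$

of |z| < 1 onto |w| < 1 carries C onto the line segment $-1 < w < 1$, and
$[z_1 z_2]$ onto a segment $r_1 \le w \le r_2$, $-1 < r_1 < r_2 < 1$. Now, setting

$U(w) = (1 - i\beta w)\,u\!\left(\frac{w + i\beta}{1 - i\beta w}\right),$

we see that U(w) satisfies the differential equation

(6) $U''(w) + \frac{(1 - \beta^2)^2}{(1 - i\beta w)^4}\,q\!\left(\frac{w + i\beta}{1 - i\beta w}\right) U(w) = 0,$

and, moreover, $U(r_1) = U(r_2) = 0$. We have, for real w,

$\left|\frac{(1 - \beta^2)^2}{(1 - i\beta w)^4}\right| = \left(\frac{1 - \beta^2}{1 + \beta^2 w^2}\right)^2, \qquad \left|\frac{w + i\beta}{1 - i\beta w}\right| = \left(\frac{w^2 + \beta^2}{1 + \beta^2 w^2}\right)^{1/2}.$

Using (4), it follows that for real w the factor of U(w) in (6) is majorized by

$\left(\frac{1 - \beta^2}{1 + \beta^2 w^2}\right)^2 M\!\left(\left(\frac{\beta^2 + w^2}{1 + \beta^2 w^2}\right)^{1/2}\right).$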
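
The two modulus identities used above can be spot-checked numerically. This is only a sanity check under sample values of $\beta$ and w chosen by us; the helper names `T` and `T_inv` are hypothetical.

```python
def T(z: complex, beta: float) -> complex:
    """w = (z - i*beta) / (1 + i*beta*z), mapping |z| < 1 onto |w| < 1."""
    return (z - 1j * beta) / (1 + 1j * beta * z)

def T_inv(w: complex, beta: float) -> complex:
    """z = (w + i*beta) / (1 - i*beta*w), the inverse of T."""
    return (w + 1j * beta) / (1 - 1j * beta * w)

beta, w = 0.6, 0.3  # sample values; w is real, as in the text

# |(1 - beta^2)^2 / (1 - i*beta*w)^4| = ((1 - beta^2) / (1 + beta^2 w^2))^2
lhs_factor = abs((1 - beta**2) ** 2 / (1 - 1j * beta * w) ** 4)
rhs_factor = ((1 - beta**2) / (1 + beta**2 * w**2)) ** 2

# |(w + i*beta) / (1 - i*beta*w)| = ((w^2 + beta^2) / (1 + beta^2 w^2))^(1/2)
lhs_mod = abs(T_inv(w, beta))
rhs_mod = ((w**2 + beta**2) / (1 + beta**2 * w**2)) ** 0.5

print(abs(T(1j * beta, beta)))                      # the point i*beta goes to the origin
print(lhs_factor - rhs_factor, lhs_mod - rhs_mod)   # both differences should be ~0
```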


Now, writing r instead of the real w, we compare (6) for $-1 < r < 1$ with
its real majorant

(7) $y''(r) + \left(\frac{1 - \beta^2}{1 + \beta^2 r^2}\right)^2 M\!\left(\left(\frac{\beta^2 + r^2}{1 + \beta^2 r^2}\right)^{1/2}\right) y(r) = 0, \qquad -1 < r < 1.$

We now use the fact (7; 11, Theorem 4.1; 1, Corollary 1.2) that if a real
solution y(r) of the majorant equation (7) is non-vanishing on an interval
$r_3 < r < r_4$, $-1 < r_3 < r_4 < 1$, then no (complex) solution U(w) of the
majorized equation (6) can have two zeros inside this interval.

For fixed $\beta$, $0 \le \beta < 1$, let us now denote any pair of consecutive zeros of
any solution of (7) by $a_1(\beta)$ and $a_2(\beta)$, $-1 < a_1(\beta) < a_2(\beta) < 1$, and set

$d(\beta) = \mathrm{g.l.b.} \int_{a_1(\beta)}^{a_2(\beta)} \frac{dr}{1 - r^2},$

where the g.l.b. is taken over all such pairs $a_1(\beta)$, $a_2(\beta)$. At this stage, without
having yet used assumption (c), we have proved that

$|[z_1 z_2]| = |[r_1 r_2]| \ge \operatorname*{g.l.b.}_{0 \le \beta < 1} d(\beta).$


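However, as noted in (1, (2.3) and (2.4) for $\beta = 1$), assumption (c) is
equivalent to the inequality

$\left(\frac{1 - \beta^2}{1 + \beta^2 r^2}\right)^2 M\!\left(\left(\frac{\beta^2 + r^2}{1 + \beta^2 r^2}\right)^{1/2}\right) \le M(r).$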
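
The equivalence with (c) can be checked by a short computation. The abbreviation $\rho$ below is introduced only for this sketch and is not notation from the paper or from (1).

```latex
\rho = \left(\frac{\beta^2 + r^2}{1 + \beta^2 r^2}\right)^{1/2} \ge r,
\qquad
1 - \rho^2 = \frac{(1 - \beta^2)(1 - r^2)}{1 + \beta^2 r^2},
\qquad 0 \le r < 1,\ 0 \le \beta < 1,
\]
so that
\[
\left(\frac{1 - \beta^2}{1 + \beta^2 r^2}\right)^2 M(\rho) \le M(r)
\iff
(1 - \rho^2)^2 M(\rho) \le (1 - r^2)^2 M(r).
```

As $\beta$ runs from 0 to 1, $\rho$ runs from r to 1, so the right-hand inequality for all admissible $\beta$ is exactly the statement that $(1 - r^2)^2 M(r)$ is non-increasing.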


As equation (7) is therefore majorized by equation (2), it follows that a lower
bound for $|[z_1 z_2]|$ is given by

$\mathrm{g.l.b.} \int_{a_1}^{a_2} \frac{dr}{1 - r^2},$

where the g.l.b. is taken over all pairs $a_1$, $a_2$, $-1 < a_1 < a_2 < 1$, of consecutive
zeros of all non-trivial solutions y(r) of equation (2). We now use


LEMMA 1. Let M(r) fulfil conditions (a) to (d') of Theorem 1. Let $a_1$, $a_2$,
$-1 < a_1 < a_2 < 1$, be any pair of consecutive zeros of any non-trivial solution
y(r) of (2). Then

(8) $\int_{a_1}^{a_2} \frac{dr}{1 - r^2} \ge \log\frac{1+a}{1-a},$

where a is defined by condition (d'). If, in addition, $(1 - r^2)^2 M(r)$ is strictly
decreasing for $0 \le r \le a$, then we have strict inequality in (8) except for the
case $a_1 = -a$, $a_2 = a$.


The proof of Lemma 1 which, as stated, will be given in §2 completes the
proof of Theorem 1.

We remark that for a function M(r) fulfilling conditions (a)-(c), condition
(d') will be implied by

$\lim_{r \to 1} (1 - r^2)^2 M(r) > 1.$

This will follow from the proof of Lemma 1, and may also be seen by using the
Sturm comparison theorem for (2) and the differential equation

$y''(r) + \frac{1 + 4\gamma^2}{(1 - r^2)^2}\,y(r) = 0, \qquad \gamma > 0,$

having the solutions

(9) $(1 - r^2)^{1/2} \sin\left\{\gamma \log\frac{1+r}{1-r} - C\right\}, \qquad -\infty < C < \infty,$

which are oscillatory for $-1 < r < 1$. On the other hand, if

$\lim_{r \to 1} (1 - r^2)^2 M(r) \le 1,$

then conditions (a)-(c) are compatible either with (d) or (d'). Indeed, for
$M(r) = 1/(1 - r^2)^2$ and $M(r) = \pi^2/4$, equation (2) has solutions which do
not vanish on $-1 < r < 1$ (cf. Nehari (6, Theorems I and II)), while examples
(ii) and (iii) below, with

$\lim_{r \to 1} (1 - r^2)^2 M(r) = 1$ and 0 respectively,

illustrate Theorem 1, i.e., correspond to (d').

With respect to the sharpness of Theorem 1, we remark that if q(z) is an
even function, real on the real segment $-1 < z < 1$, which attains its maximum
on the real axis for each |z| = C, 0 < C < 1, and if in addition
$(1 - r^2)^2 q(r)$ is non-increasing for $0 \le r < 1$, then, taking q(r) as the M(r) of
Theorem 1, there exists a solution u(z) of (1) and zeros $z_1$, $z_2$ of u(z) such that we have
equality in (5). Indeed, $-a$ and a (defined by (d')) are zeros of any even
solution of (1). It follows that the inequality (5) is the best possible of its kind.

The following examples deal with functions M(r) for which such corresponding
functions q(z) are readily found.


(i) $M(r) = \frac{1 + 4\gamma^2}{(1 - r^2)^2}, \qquad \gamma > 0.$

In this case we have, for any pair $a_1$, $a_2$ of consecutive zeros of any real solution
y(r) of equation (2), given by (9), $|[a_1 a_2]| = \pi/(2\gamma)$. Sharpness is shown by
$q(z) = (1 + 4\gamma^2)/(1 - z^2)^2$. This case was considered earlier (10, Theorem 3);
however, the bound given there was not sharp. It is of interest to note that in
this case, equations (2) and (7) are identical so that the result would follow
without an application of Lemma 1.
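
The value $\pi/(2\gamma)$ in example (i) can be confirmed numerically from the explicit solutions (9). The sketch below assumes the line element $|dz|/(1 - |z|^2)$ on the real diameter; the function name and the values of $\gamma$ and C are illustrative choices of ours.

```python
import math

def zeros_example_i(gamma: float, C: float, k_values) -> list:
    """Zeros r_k in (-1, 1) of (1 - r^2)^(1/2) * sin(gamma*log((1+r)/(1-r)) - C)."""
    # gamma*log((1+r)/(1-r)) = 2*gamma*artanh(r), so r_k = tanh((C + k*pi)/(2*gamma)).
    return [math.tanh((C + k * math.pi) / (2 * gamma)) for k in k_values]

gamma, C = 1.5, 0.2                        # illustrative values
zs = zeros_example_i(gamma, C, range(-2, 3))
# On the real diameter the distance between a < b is artanh(b) - artanh(a).
gaps = [math.atanh(b) - math.atanh(a) for a, b in zip(zs, zs[1:])]
print(gaps, math.pi / (2 * gamma))         # every gap should equal pi/(2*gamma)
```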

(ii) $M(r) = \frac{1}{(1 - r^2)^2} + \frac{\nu(\nu + 1)}{1 - r^2},$

where $\nu$ is an even positive integer. The even solution of (2) is, in this case,
$(1 - r^2)^{1/2} P_\nu(r)$, where $P_\nu(r)$ is the Legendre polynomial of degree $\nu$. Denoting
its least positive zero by $\alpha_\nu$, we obtain the bound $\log[(1 + \alpha_\nu)/(1 - \alpha_\nu)]$.
Sharpness is shown by $q(z) = [1/(1 - z^2)^2] + \nu(\nu + 1)/(1 - z^2)$.

(iii) $M(r) = \frac{\nu(\nu + 1)}{1 - r^2},$

where $\nu$ is an odd positive integer larger than 1. In this case the even solution
of (2) is $(1 - r^2) P_\nu'(r)$, where $P_\nu(r)$ is again the $\nu$th Legendre polynomial.
Denoting the least positive zero of $P_\nu'(r)$ by $\beta_\nu$, we obtain the bound
$\log[(1 + \beta_\nu)/(1 - \beta_\nu)]$; q(z) is given by $\nu(\nu + 1)/(1 - z^2)$.

For examples (ii) and (iii) it was shown earlier (1, Corollary 2.1 and Corollary
4.1) that no solution of (1) has two zeros in the circle $|z| < \alpha_\nu$ or $|z| < \beta_\nu$
respectively, a fact that now follows from Theorem 1.
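
For small $\nu$ the bounds of examples (ii) and (iii) can be computed directly. The following sketch evaluates $P_\nu$ and $P_\nu'$ by the standard three-term recurrence and locates the least positive zero by a sign scan followed by bisection; the helper names and the scan resolution are our own assumptions, and the scan could miss zeros for very large $\nu$.

```python
import math

def legendre(nu: int, r: float) -> tuple:
    """Return (P_nu(r), P_nu'(r)) for |r| < 1 via the three-term recurrence."""
    if nu == 0:
        return 1.0, 0.0
    p_prev, p = 1.0, r                       # P_0 and P_1
    for n in range(1, nu):
        p_prev, p = p, ((2 * n + 1) * r * p - n * p_prev) / (n + 1)
    # (1 - r^2) P_nu'(r) = nu * (P_{nu-1}(r) - r * P_nu(r))
    dp = nu * (p_prev - r * p) / (1 - r * r)
    return p, dp

def least_positive_zero(f, steps: int = 2000) -> float:
    """Least zero of f in (0, 1): sign scan on a uniform grid, then bisection."""
    lo = 1e-9
    for i in range(1, steps):
        hi = i / steps
        if f(lo) * f(hi) < 0:
            for _ in range(80):
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            return 0.5 * (lo + hi)
        lo = hi
    raise ValueError("no sign change found in (0, 1)")

def bound(zero: float) -> float:
    """The lower bound log((1 + zero) / (1 - zero)) of Theorem 1."""
    return math.log((1 + zero) / (1 - zero))

alpha_2 = least_positive_zero(lambda r: legendre(2, r)[0])   # example (ii), nu = 2
beta_3 = least_positive_zero(lambda r: legendre(3, r)[1])    # example (iii), nu = 3
print(alpha_2, bound(alpha_2))
print(beta_3, bound(beta_3))
```

For $\nu = 2$ the zero is $1/\sqrt{3}$ and for $\nu = 3$ the zero of $P_3'$ is $1/\sqrt{5}$, which gives a quick check of the routine.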


2. Bounds for the least positive eigenvalue. The statement of Theorem
2 uses notions which are defined in the books of Hardy, Littlewood and Pólya
(3, chap. X), and of Pólya and Szegő (9, chap. VII). Moreover, its proof
relies on a theorem of these books (3, Theorem 378, and 9, p. 153). For
completeness we shall restate this material (cf. 9, pp. 151-153), but only for
the case needed here, i.e., for real functions defined and continuous on the
closed segment $[-x_0, x_0]$, $0 < x_0 < \infty$.

Let f(x) and g(x) be two such functions. They are called similarly ordered
if for each pair of points $x_1$, $x_2$ of the above interval we have

$[f(x_1) - f(x_2)] \cdot [g(x_1) - g(x_2)] \ge 0;$

f and g are called oppositely ordered if f and $-g$ are similarly ordered. Consider
now, for each real u, the set of points x in $[-x_0, x_0]$ for which $f(x) \ge u$ and
denote its measure by M(u). Let N(u) be related to g as M(u) is to f. If, for
each real u, we have M(u) = N(u) then we say that f and g are equimeasurable.
We now quote the special case of the above-mentioned theorem as


LEMMA 2. If f, $f_1$, $f_2$, g, $g_1$, and $g_2$ are real continuous functions defined on
$[-x_0, x_0]$, $0 < x_0 < \infty$, $f_1$ and $g_1$ are similarly ordered, $f_2$ and $g_2$ oppositely
ordered, f, $f_1$ and $f_2$ are equimeasurable, and also g, $g_1$ and $g_2$ are equimeasurable,
then

$\int_{-x_0}^{x_0} f_2 g_2\,dx \le \int_{-x_0}^{x_0} f g\,dx \le \int_{-x_0}^{x_0} f_1 g_1\,dx.$

Let f(x) be defined as above. Let f(x), $f^+(x)$ and $f^-(x)$ be equimeasurable,
and in addition let $f^+(x)$ and $x^2$ be similarly ordered, and $f^-(x)$ and $x^2$ be
oppositely ordered. The uniquely defined and continuous functions $f^+(x)$ and
$f^-(x)$ are called the rearrangements of f(x) in symmetrically increasing and
symmetrically decreasing order, respectively. $f^-(x)$ is an even function decreasing
(i.e., non-increasing) for $0 \le x \le x_0$. The connection between $f^+(x)$ and $f^-(x)$ is
given by $f^+(x) = f^-(x_0 - x)$ for $0 \le x \le x_0$, and $f^+(x) = f^+(-x)$ for $-x_0 \le x < 0$.
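
A discrete analogue may help to fix these notions. In the sketch below the functions are replaced by samples on an odd number of equally spaced points, the rearrangements are built by sorting, and the two inequalities of Lemma 2 reduce to the classical rearrangement inequality for finite sums. The function names and the sample functions are hypothetical; the discrete $f^+$ is only an analogue of the continuous definition.

```python
import math

def symmetric_decreasing(values):
    """Discrete rearrangement in symmetrically decreasing order.

    The largest sample goes to the centre, the following ones alternately
    to its right and to its left. Expects an odd number of samples.
    """
    n = len(values)
    if n % 2 == 0:
        raise ValueError("use an odd number of sample points")
    mid = n // 2
    out = [0.0] * n
    for rank, v in enumerate(sorted(values, reverse=True)):
        offset = (rank + 1) // 2
        pos = mid + offset if rank % 2 == 1 else mid - offset
        out[pos] = v
    return out

def symmetric_increasing(values):
    """Discrete analogue of f^+: the smallest sample goes to the centre."""
    return [-v for v in symmetric_decreasing([-v for v in values])]

def dot(u, v):
    """Riemann-sum stand-in for the integral of u*v (grid spacing omitted)."""
    return sum(a * b for a, b in zip(u, v))

n = 41
xs = [-1 + 2 * i / (n - 1) for i in range(n)]
f = [math.sin(3 * x) + 0.5 * x for x in xs]      # arbitrary sample functions
g = [x * x - 0.3 * x for x in xs]

upper = dot(symmetric_decreasing(f), symmetric_decreasing(g))   # similarly ordered pair
lower = dot(symmetric_increasing(f), symmetric_decreasing(g))   # oppositely ordered pair
print(lower, dot(f, g), upper)
```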

We may now state 


THEOREM 2. Let p(x) be continuous and not identically zero² for $-x_0 \le x \le x_0$,
$0 < x_0 < \infty$, and let $p^+(x)$ and $p^-(x)$ be the rearrangements of p(x) in
symmetrically increasing and symmetrically decreasing order, respectively.
Consider the three differential systems

(10) $y''(x) + \lambda p(x)\,y(x) = 0, \qquad y(\pm x_0) = 0;$
(10+) $u''(x) + \lambda^+ p^+(x)\,u(x) = 0, \qquad u(\pm x_0) = 0;$
(10-) $v''(x) + \lambda^- p^-(x)\,v(x) = 0, \qquad v(\pm x_0) = 0;$

denote their least positive eigenvalues also by $\lambda$, $\lambda^+$ and $\lambda^-$ respectively. Then
$\lambda^- \le \lambda$ even if p(x) changes sign finitely often, while $\lambda \le \lambda^+$ holds if $p(x) \ge 0$.


² If $p(x) \le 0$ throughout the interval, then the differential systems (10) have no positive
eigenvalues; p(x) is therefore assumed to be positive somewhere in the interval.

Proof. We shall use, in addition to Lemma 2, the minimum property of
the least positive eigenvalue. Since the first half of this theorem deals with
the polar case, where p(x) may change sign, we shall give an explicit statement
of this property (5, pp. 214-215). Consider all functions³ y(x) of class
D' on $-x_0 \le x \le x_0$ such that $y(\pm x_0) = 0$, and such that

$\int_{-x_0}^{x_0} p\,y^2\,dx > 0.$

Then the least positive eigenvalue of the system (10) is given by

$\lambda = \min \frac{\int_{-x_0}^{x_0} y'^2\,dx}{\int_{-x_0}^{x_0} p\,y^2\,dx},$

where the minimum is taken over the above class. This minimum is obtained
for a solution of the system (10), and we shall denote this eigenfunction
corresponding to the least positive eigenvalue by y(x). We also use the fact
(see Ince (4, p. 237) or Bôcher (2, p. 176)) that this first eigenfunction does
not vanish for $-x_0 < x < x_0$.

It follows that y(x) may be assumed to be positive for $-x_0 < x < x_0$;
the same therefore holds true for $y^-(x)$, the rearrangement of y(x) in symmetrically
decreasing order. $y^-(x)$ vanishes, together with y(x), at $\pm x_0$. Moreover,
it is easily seen that $y^-(x)$ is continuous and that its derivative may have
discontinuities only for those values of the ordinate $y^-$ for which y had extrema.
Since by hypothesis p(x) changes sign only finitely often, y(x) has only a
finite number of inflection points and of extrema, and it follows that $y^-(x)$
is in D'. We then have


(11-) $\lambda = \frac{\int_{-x_0}^{x_0} y'^2\,dx}{\int_{-x_0}^{x_0} p\,y^2\,dx} \ge \frac{\int_{-x_0}^{x_0} (y^-)'^2\,dx}{\int_{-x_0}^{x_0} p^- (y^-)^2\,dx} \ge \min \frac{\int_{-x_0}^{x_0} v'^2\,dx}{\int_{-x_0}^{x_0} p^- v^2\,dx} = \lambda^-.$

To justify the first inequality sign we remark that $\{y^-(x)\}^2$ is, together with
$y^-(x)$, symmetrically decreasing; $p^-(x)$ and $\{y^-(x)\}^2$ are therefore similarly
ordered, and it follows from Lemma 2 that

$\int_{-x_0}^{x_0} p\,y^2\,dx \le \int_{-x_0}^{x_0} p^- (y^-)^2\,dx.$

That

$\int_{-x_0}^{x_0} y'^2\,dx \ge \int_{-x_0}^{x_0} (y^-)'^2\,dx, \qquad y(\pm x_0) = 0,$

i.e., that under symmetrization the one-dimensional Dirichlet integral decreases,
follows (analogous to the two-dimensional case; see (9, Note A.3))
from the well-known fact that the arc length decreases under symmetrization.
The minimum in the fourth term of (11-) is taken over all functions v(x) of
class D' such that $v(\pm x_0) = 0$ and such that

$\int_{-x_0}^{x_0} p^- v^2\,dx > 0.$

³ Kamke (5) states the minimum property only for comparison functions of class C'; however,
an elementary argument extends the validity of the result to this wider class of functions.

To prove that $\lambda \le \lambda^+$ under the hypothesis $p(x) \ge 0$ (but $p(x) \not\equiv 0$), let
u(x) be a fixed first eigenfunction of (10+). Since (10+) is a symmetric differential
system, u(x) is an even function of class C', which we may assume to be
positive for $-x_0 < x < x_0$. It follows, moreover, from (10+) that u(x) is
concave from below and therefore symmetrically decreasing. Here we use the
fact that $p^+(x)$ is, together with p(x), non-negative for $-x_0 \le x \le x_0$. We
have now

(11+) $\lambda^+ = \frac{\int_{-x_0}^{x_0} u'^2\,dx}{\int_{-x_0}^{x_0} p^+ u^2\,dx} \ge \frac{\int_{-x_0}^{x_0} u'^2\,dx}{\int_{-x_0}^{x_0} p\,u^2\,dx} \ge \min \frac{\int_{-x_0}^{x_0} y'^2\,dx}{\int_{-x_0}^{x_0} p\,y^2\,dx} = \lambda,$

the first inequality sign in (11+) now following from the fact that $\{u(x)\}^2$ is,
together with u(x), symmetrically decreasing so that $p^+(x)$ and $\{u(x)\}^2$ are
oppositely ordered.
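
Theorem 2 can be illustrated numerically. The sketch below uses a simple shooting method (classical Runge-Kutta integration from $-x_0$, then a scan and bisection in $\lambda$) for the sample weight p(x) = 1 + x on [-1, 1], whose rearrangements are $p^-(x) = 2 - 2|x|$ and $p^+(x) = 2|x|$. The method, the function names, the step sizes and the choice of p are all our assumptions; nothing here is taken from the paper's proof, and the scan presumes that no two eigenvalues lie within one scan step.

```python
def endpoint_value(p, lam: float, x0: float, steps: int = 400) -> float:
    """Integrate y'' + lam*p(x)*y = 0, y(-x0) = 0, y'(-x0) = 1 by RK4; return y(x0)."""
    h = 2 * x0 / steps
    x, y, dy = -x0, 0.0, 1.0
    for _ in range(steps):
        k1y, k1d = dy, -lam * p(x) * y
        k2y, k2d = dy + 0.5 * h * k1d, -lam * p(x + 0.5 * h) * (y + 0.5 * h * k1y)
        k3y, k3d = dy + 0.5 * h * k2d, -lam * p(x + 0.5 * h) * (y + 0.5 * h * k2y)
        k4y, k4d = dy + h * k3d, -lam * p(x + h) * (y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        dy += h * (k1d + 2 * k2d + 2 * k3d + k4d) / 6
        x += h
    return y

def least_positive_eigenvalue(p, x0: float, step: float = 0.05, lam_max: float = 50.0) -> float:
    """Smallest lam > 0 with y(x0) = 0 (for p >= 0): scan in lam, then bisection."""
    lo = 0.0                       # at lam = 0 the solution is y = x + x0 > 0
    while lo < lam_max:
        hi = lo + step
        if endpoint_value(p, hi, x0) <= 0:
            for _ in range(50):
                mid = 0.5 * (lo + hi)
                if endpoint_value(p, mid, x0) > 0:
                    lo = mid
                else:
                    hi = mid
            return 0.5 * (lo + hi)
        lo = hi
    raise ValueError("no eigenvalue found below lam_max")

x0 = 1.0
p = lambda x: 1 + x                  # sample weight, non-negative on [-1, 1]
p_minus = lambda x: 2 - 2 * abs(x)   # its symmetrically decreasing rearrangement
p_plus = lambda x: 2 * abs(x)        # its symmetrically increasing rearrangement
lam_minus, lam, lam_plus = (least_positive_eigenvalue(q, x0) for q in (p_minus, p, p_plus))
print(lam_minus, lam, lam_plus)      # Theorem 2 predicts lam_minus <= lam <= lam_plus
```

For the constant weight p = 1 the routine should return approximately $(\pi/2)^2$, which is a convenient check of the integrator.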

Theorem 2 includes a result announced by Pokornyi (8), and proved
elsewhere (1, Lemma 5.2):

Let p(x) be continuous and non-negative on the interval $-x_0 \le x \le x_0$,
$p(x) = p(-x)$, and let p(x) be non-increasing for $0 \le x \le x_0$. Suppose
$y''(x) + p(x)\,y(x) = 0$ has a solution which does not vanish on $-x_0 < x < x_0$.
Set

$p_1(x) = \begin{cases} p(x_0 - x), & 0 \le x \le x_0, \\ p_1(-x), & -x_0 \le x < 0; \end{cases}$

then the equation $y_1''(x) + p_1(x)\,y_1(x) = 0$ has a solution with the same property.

In our notation this result is equivalent to the inequality $\lambda^+ \ge \lambda^-$ and follows
therefore, for non-negative p(x), from Theorem 2.

We finally remark that Theorem 2 can be generalized to two dimensions,
i.e., to the equation

$\Delta u(x, y) + \lambda p(x, y)\,u(x, y) = 0.$

We intend to deal with this and related material in another paper.

For the proof of Lemma 1 we need an intermediate step which is a consequence
of the first half of Theorem 2. For completeness, however, we shall
also state and prove the analogous consequence of the second half of Theorem 2.


LEMMA 3. Let p(x) have the following properties:

(a) p(x) is continuous for $-\infty < x < \infty$;

(b) $p(-x) = p(x)$;

(c) p(x) is non-increasing for $0 \le x < \infty$;

(d') the even solution $y(-x) = y(x)$ of the differential equation

(12) $y''(x) + p(x)\,y(x) = 0, \qquad -\infty < x < \infty,$

vanishes for finite x, with its least positive zero at $x = \alpha$.

Let $\alpha_1$, $\alpha_2$, $-\infty < \alpha_1 < \alpha_2 < \infty$, be any pair of consecutive zeros of any
non-trivial solution of (12). Then

(13) $\alpha_2 - \alpha_1 \ge 2\alpha.$

If, in addition, p(x) is strictly decreasing for $0 \le x \le \alpha$ then we have strict
inequality in (13) except for the case $\alpha_1 = -\alpha$, $\alpha_2 = \alpha$.

Moreover, if we keep conditions (b) and (d') but replace (a) and (c) by (a'):
p(x) is non-negative and continuous for $-\infty < x < \infty$, and (c'): p(x) is
non-decreasing for $0 \le x < \infty$, then we have

(13') $\alpha_2 - \alpha_1 \le 2\alpha.$

In this case strict inequality holds in (13') (except for $\alpha_1 = -\alpha$, $\alpha_2 = \alpha$) if p(x)
is strictly increasing for $0 \le x \le \alpha$.


We begin the proof of the first half of the lemma with the remark that the
properties of p(x) imply p(0) > 0. Now take any fixed real $c \ne 0$, and consider
the function $q(x) = p(x + c)$ for the interval $-\alpha \le x \le \alpha$ only, where $\alpha$ is
defined by (d'). We now compare the three differential systems

(14) $y''(x) + \lambda q(x)\,y(x) = 0, \qquad y(\pm\alpha) = 0,$
(14-) $v''(x) + \lambda^- q^-(x)\,v(x) = 0, \qquad v(\pm\alpha) = 0,$

and

(14°) $Y''(x) + \lambda^0 p(x)\,Y(x) = 0, \qquad Y(\pm\alpha) = 0,$

and denote their least positive eigenvalues also by $\lambda$, $\lambda^-$ and $\lambda^0$ respectively.
These eigenvalues will all exist since, at least for all constants c which we need
to consider, each of the three functions q(x), $q^-(x)$ and p(x) is somewhere
positive in $-\alpha \le x \le \alpha$. This is true of p(x) by our first remark; as far as
q(x) is concerned (and hence also $q^-(x)$) we need only consider constants c
which are such that $q(x) = p(x + c)$ is somewhere positive in $-\alpha \le x \le \alpha$.
For, if $p(x) \le 0$ for $-\alpha + c \le x \le \alpha + c$, no solution of (12) can have two
zeros in this interval, i.e., $\alpha_1$ and $\alpha_2$ are not defined in this case.

In (14-) $q^-(x)$ is the rearrangement of q(x) (considered only for $-\alpha \le x \le \alpha$)
in symmetrically decreasing order. It follows from (b) and (c) that

(15) $q^-(x) \le p(x), \qquad -\alpha \le x \le \alpha.$

If, in addition, p(x) is strictly decreasing for $0 \le x \le \alpha$, then we have strict
inequality in (15) for some subintervals of $-\alpha \le x \le \alpha$. By using the minimum
property for (14-) and (14°) it follows from (15) that $\lambda^- \ge \lambda^0$ with strict
inequality if p(x) is strictly decreasing. By Theorem 2 it follows that $\lambda \ge \lambda^-$,
hence we have $\lambda \ge \lambda^0$ with strict inequality if p(x) is strictly decreasing.

We now show that condition (d') implies $\lambda^0 = 1$. Let Y(x) be an even solution
of (12). We have $Y(\pm\alpha) = 0$ and $Y(x) \ne 0$ for $-\alpha < x < \alpha$. Since

$\int_{-\alpha}^{\alpha} p\,Y^2\,dx = \int_{-\alpha}^{\alpha} Y'^2\,dx > 0$

it follows from the minimum property of the least positive eigenvalue that
$\lambda^0 \le 1$. On the other hand, if w(x) is any function of class D' on $-\alpha \le x \le \alpha$
such that $w(\pm\alpha) = 0$, then $Y(x) \ne 0$ for $-\alpha < x < \alpha$ implies (cf. 1, Lemma
1.1)

(16) $\int_{-\alpha}^{\alpha} w'^2\,dx \ge \int_{-\alpha}^{\alpha} p\,w^2\,dx,$

so that $\lambda^0 \ge 1$.

It remains to show that $\lambda \ge 1$ implies (13), and that $\lambda > 1$ implies strict
inequality in (13). First, if $\lambda = 1$, then the corresponding eigenfunction of
the system (14) has consecutive zeros at $x = \pm\alpha$, and the corresponding
solution of (12) has $\alpha_2 - \alpha_1 = (\alpha + c) - (-\alpha + c) = 2\alpha$ so that (13) is
satisfied. Now, suppose that $\lambda \ge 1$ and that $\alpha_2 - \alpha_1 \ge 2\alpha$ is not satisfied.
Then for an appropriate $c \ne 0$ the solution of the equation

$y''(x) + q(x)\,y(x) = 0, \qquad q(x) = p(x + c),$

which vanishes at $x = -\alpha$ would vanish again at x', where $-\alpha < x' < \alpha$.
Now define $y_1(x)$ by

$y_1(x) = \begin{cases} y(x), & -\alpha \le x \le x', \\ 0, & x' \le x \le \alpha. \end{cases}$

We now have

$0 < \int_{-\alpha}^{\alpha} y_1'^2\,dx = \int_{-\alpha}^{\alpha} q\,y_1^2\,dx.$

By the preceding inequality (and since $\lambda \ge 1$) it follows that

$0 < \frac{\int_{-\alpha}^{\alpha} y_1'^2\,dx}{\int_{-\alpha}^{\alpha} q\,y_1^2\,dx} \le \lambda.$

But this contradicts the minimum property of the least positive eigenvalue $\lambda$ of
the system (14), and thus proves the first half of our lemma. The proof of the
second half is analogous but somewhat simpler since p(x) is non-negative.
The properties of p(x) now imply that p(x) is not identically zero. We now
compare (14) with

(14+) $u''(x) + \lambda^+ q^+(x)\,u(x) = 0, \qquad u(\pm\alpha) = 0,$

and (14°), using now $q^+(x) \ge p(x)$, $-\alpha \le x \le \alpha$, and the other half of
Theorem 2. In this case, $p(x) \ge 0$ and (d') imply $\lambda^0 = 1$ by a direct application
of the Sturm comparison theorem; moreover, the fact that $\lambda^+ \le 1$ ($\lambda^+ < 1$) implies (13')
(with strict inequality if p(x) is strictly increasing) also follows from the Sturm
comparison theorem. This completes the proof of Lemma 3.
Suppose now that M(r) has the properties (a)-(d') of Lemma 1. Set
$g(r) = (1 - r^2)^2 M(r)$; by (c) g(r) is non-increasing for $0 \le r < 1$. Let y(r) be any
solution of the differential equation

(2) $y''(r) + M(r)\,y(r) = 0, \qquad -1 < r < 1.$

Set

$x = \tfrac12 \log\frac{1+r}{1-r}, \qquad -1 < r < 1,$

and define

(17) $Y(x) = \frac{e^x + e^{-x}}{2}\,y\!\left(\frac{e^x - e^{-x}}{e^x + e^{-x}}\right).$

Y(x) is then a solution of the differential equation

(12) $Y''(x) + p(x)\,Y(x) = 0, \qquad -\infty < x < \infty,$

where

$p(x) = g\!\left(\frac{e^x - e^{-x}}{e^x + e^{-x}}\right) - 1.$

p(x) has the properties (a)-(d') of Lemma 3. The even solutions y(r) of (2)
transform into the even solutions Y(x) of (12). The numbers a and $\alpha$ defined
by conditions (d') are connected by

$\alpha = \tfrac12 \log\frac{1+a}{1-a},$

and g(r) strictly decreasing implies p(x) is also strictly decreasing. If $a_1$
and $a_2$ are any pair of consecutive zeros of a solution y(r) of (2), then, setting

$\alpha_i = \tfrac12 \log\frac{1+a_i}{1-a_i}, \qquad i = 1, 2,$

$\alpha_1$ and $\alpha_2$ will be the corresponding consecutive zeros of the corresponding
solution Y(x) of (12). We have now

$\alpha_2 - \alpha_1 = \int_{\alpha_1}^{\alpha_2} dx = \int_{a_1}^{a_2} \frac{dr}{1 - r^2} = \int_{a_1}^{a_2} \frac{|dz|}{1 - |z|^2} = |[a_1 a_2]|.$

By this equality it follows that (13) implies (8). We have thus proved Lemma
1 and with it the proof of Theorem 1 is complete.
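
The passage from (2) to (12) rests on a routine change of variables, sketched here for convenience. The computation is ours; it uses $y(r) = (1 - r^2)^{1/2}\,Y(x)$, which agrees with (17) because $1 - r^2 = 4/(e^x + e^{-x})^2$.

```latex
x = \tfrac12 \log\frac{1+r}{1-r}, \qquad \frac{dx}{dr} = \frac{1}{1 - r^2}, \qquad
y(r) = (1 - r^2)^{1/2}\, Y(x),
\]
\[
y'(r) = (1 - r^2)^{-1/2}\,\bigl(Y'(x) - r\,Y(x)\bigr), \qquad
y''(r) = (1 - r^2)^{-3/2}\,\bigl(Y''(x) - Y(x)\bigr),
\]
\[
y''(r) + M(r)\,y(r) = (1 - r^2)^{-3/2}\,\Bigl(Y''(x) + \bigl[(1 - r^2)^2 M(r) - 1\bigr]\,Y(x)\Bigr).
```

Hence y satisfies (2) exactly when Y satisfies $Y'' + p\,Y = 0$ with $p = g - 1$, and the zeros of y and Y correspond under $r = \tanh x$.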


REFERENCES 


1. P. R. Beesack, Non-oscillation and disconjugacy in the complex domain, Trans. Amer.
Math. Soc. 81 (1956), 211-242.

2. M. Bôcher, Proc. Fifth International Cong. of Math., Cambridge (1912), 1, 163-195.

3. G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities (2nd edition, Cambridge, 1952).

4. E. L. Ince, Ordinary Differential Equations (Dover, New York, 1944).

5. E. Kamke, Differentialgleichungen Lösungsmethoden und Lösungen (Chelsea, New York,
1948).

6. Z. Nehari, The Schwarzian derivative and schlicht functions, Bull. Amer. Math. Soc. 56
(1949), 545-551.

7. Z. Nehari, Some criteria of univalence, Proc. Amer. Math. Soc. 6 (1954), 700-704.

8. V. V. Pokornyi, On some sufficient conditions for univalence, Doklady Akademii Nauk
SSSR (N.S.) 79 (1951), 743-746.

9. G. Pólya and G. Szegő, Isoperimetric Inequalities in Mathematical Physics (Princeton,
1951).

10. B. Schwarz, Complex non-oscillation theorems and criteria of univalence, Trans. Amer.
Math. Soc. 80 (1955), 159-186.

11. C. T. Taam, Oscillation theorems, Amer. J. of Math. 74 (1952), 317-324.


Hamilton College, McMaster University, Hamilton, Canada
Technion (Israel Institute of Technology), Haifa, Israel











ON THE EXTENSION OF MEASURE 
BY THE METHOD OF BOREL 


L. LeBLANC AND G. E. FOX


Introduction. This paper concerns the problem of extending a given
measure defined on a Boolean ring to a measure on the generated $\sigma$-ring.
Two general methods are familiar to the literature, that of Lebesgue (outer
measure) and a method proposed by Borel using transfinite induction (4,
49-134; 2, 228-238). The problem of Borel, to extend a finite measure on an
algebra of sets to a measure on the generated $\sigma$-algebra by way of transfinite
induction on appropriate intermediate classes, has been solved by several
authors (5; 1). In the present paper we propose to develop the method of
Borel in its full generality, that is, to extend, by transfinite induction, an
arbitrary measure, not necessarily finite, defined on a Boolean ring of sets,
to a measure on the generated $\sigma$-ring.


1. Boolean relations. The terminology and notation of Halmos (3)
will be used without comment. A sequence of sets $\{E_n\}$ will be called "ascending"
("descending") if

$E_n \subset E_{n+1} \quad (E_n \supset E_{n+1}), \qquad n = 1, 2, \ldots.$

To indicate that a given sequence of sets $\{E_n\}$ is disjoint, the symbol $\sum$
or + will be used instead of $\cup$.


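1.1 A sequence of sets $\{E_n\}$ is said to converge if

$\bigcap_{n=1}^{\infty} \bigcup_{i=n}^{\infty} E_i = \bigcup_{n=1}^{\infty} \bigcap_{i=n}^{\infty} E_i,$

in which case it converges to the limit

$\lim E_n = \bigcap_{n=1}^{\infty} \bigcup_{i=n}^{\infty} E_i = \bigcup_{n=1}^{\infty} \bigcap_{i=n}^{\infty} E_i.$

If the sequences $\{E_n\}$, $\{F_n\}$ both converge, then also $\{E_n \cup F_n\}$ converges and

$\lim (E_n \cup F_n) = \lim E_n \cup \lim F_n;$

similarly for the intersection, difference and complement.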
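
For a sequence that eventually repeats a finite cycle of sets, the two sides of the convergence criterion in 1.1 can be computed exactly: the upper limit is the union of the cycle and the lower limit is its intersection. The following sketch covers only that special case; the function names and the restriction to eventually periodic sequences are our assumptions.

```python
from functools import reduce

def limits_of_periodic_tail(cycle):
    """Upper and lower limits of a set sequence whose tail repeats `cycle` forever.

    Every tail of such a sequence contains each set of the cycle, so the
    intersection over n of the unions of E_i (i >= n) is the union of the
    cycle, and the union over n of the intersections is its intersection.
    """
    sets = [frozenset(s) for s in cycle]
    upper = reduce(frozenset.union, sets)
    lower = reduce(frozenset.intersection, sets)
    return upper, lower

def converges(cycle) -> bool:
    """The sequence converges in the sense of 1.1 iff the two limits coincide."""
    upper, lower = limits_of_periodic_tail(cycle)
    return upper == lower

print(converges([{1, 2}]))       # constant tail
print(converges([{1}, {2}]))     # alternating tail
```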

1.2 Let C denote a given class of subsets of the space: $C^-$ ($C^+$) denotes the
class of limits of descending (ascending) sequences of sets of C; $C^0$ denotes
the class of limits of convergent sequences of sets of C. The letter R always
denotes a ring of sets. $R^-$ ($R^+$) is a lattice closed under countable intersections
(unions), and $R^0$ is a ring; we have the following inclusions:

$R \subset R^- \subset R^{-+}, \qquad R \subset R^+ \subset R^{+-}, \qquad R^0 \subset R^{-+}, \qquad R^0 \subset R^{+-}.$
Received January 5, 1956. 


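1.3 With a given ring R is associated a transfinite sequence of rings:

$R_0 \subset R_1 \subset R_2 \subset \ldots \subset R_\alpha \subset \ldots,$

where $R_0 = R$, $R_\alpha = R^0_{\alpha-1}$ if $\alpha$ is an ordinal number of the first kind, and

$R_\alpha = \bigcup_{\beta < \alpha} R_\beta$

if $\alpha$ is of the second kind. If $\omega_1$ denotes the first non-countable ordinal,
$R_{\omega_1} = S(R)$, the $\sigma$-ring generated by R.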


1.4 We say that a set F "covers" the set E if $F \supset E$. The class of sets covered
by some set of $R^+$ is a $\sigma$-ring, consequently every set of S(R) is covered by
some set of $R^+$.


1.5 By a "measure on a lattice" we mean a real function on the sets of the
lattice satisfying the defining conditions corresponding to those of a (countably
additive) measure on a ring, plus the monotone property: $\mu(E) \le \mu(F)$ if
$E \subset F$. Since a lattice of sets contains the null set (by definition), a measure
on a lattice is additive in the finite sense, and subtractive.


2. Induction of the measure. The immediate object is the extension
of a measure $\mu$ on R to a measure $\mu^0$ on $R^0$. The first step will be to extend $\mu$
to a measure $\mu^+$ on the lattice $R^+$, using the fact that $\mu$ is continuous from
below on R. The second step will be to extend $\mu^+$ to $\mu^0$ on $R^0$, using the fact that
every set of $R^0$ is covered by a set of $R^+$.


THEOREM 2.1. A measure $\mu$ on a ring R admits an extension to a measure $\mu^+$
on the lattice $R^+$.


Proof. If $\{E_m\}$, $\{F_n\}$ ($E_m, F_n \in R$) are ascending sequences converging to
$E \in R^+$ it is easily verified that

$\lim_m \lim_n \mu(E_m \cap F_n) = \lim_n \lim_m \mu(E_m \cap F_n).$

Since $\mu$ is continuous from below on R,

$\lim_m \mu(E_m) = \lim_m \mu(E_m \cap E) = \lim_m \mu(\lim_n (E_m \cap F_n)) = \lim_m \lim_n \mu(E_m \cap F_n).$

Similarly,

$\lim_n \mu(F_n) = \lim_n \lim_m \mu(E_m \cap F_n),$

so that

$\lim_m \mu(E_m) = \lim_n \mu(F_n).$

We may therefore define, without ambiguity,

$\mu^+(E) = \lim_m \mu(E_m),$

where $\{E_m\}$ is any ascending sequence of sets of R which converges to E;
furthermore $\mu^+$ on $R^+$ is an extension of the function $\mu$ on R. Let F, E be sets
of $R^+$ such that $F \subset E$. Let $\{F_n\}$, $\{E_n\}$ be ascending sequences of sets of R
converging to F, E respectively. $\{E_n \cap F_n\}$ is an ascending sequence of sets
of R converging to F. Since $\mu(E_n \cap F_n) \le \mu(E_n)$, we have in the limit
$\mu^+(F) \le \mu^+(E)$. Suppose that

$E = \sum_{n=1}^{\infty} E_n \quad (E_n \in R);$

then $E \in R^+$, and

$\mu^+(E) = \lim_n \mu\Bigl(\sum_{i=1}^{n} E_i\Bigr) = \lim_n \sum_{i=1}^{n} \mu(E_i) = \sum_{n=1}^{\infty} \mu(E_n).$

Suppose that

$E = \sum_{n=1}^{\infty} E_n \quad (E_n \in R^+);$

then $E \in R^+$ and the $E_n$ may be decomposed:

$E_n = \sum_{m=1}^{\infty} E_{nm} \quad (E_{nm} \in R),$

so that

$\mu^+(E) = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \mu(E_{nm}) = \sum_{n=1}^{\infty} \mu^+(E_n).$

Thus $\mu^+$ on $R^+$ satisfies the conditions of 1.5. In what follows, $\mu^+$ denotes the
extension to $R^+$ of $\mu$ on R, according to this theorem.


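THEOREM 2.2. The measure $\mu^+$ on $R^+$ enjoys the following properties:
(1) If $E, F \in R^+$ have finite measures,

$\mu^+(E \cup F) = \mu^+(E) + \mu^+(F) - \mu^+(E \cap F).$

(2) $\mu^+(E \cup F) \le \mu^+(E) + \mu^+(F) \quad (E, F \in R^+).$

(3) $\mu^+\Bigl(\bigcup_{n=1}^{\infty} E_n\Bigr) \le \sum_{n=1}^{\infty} \mu^+(E_n) \quad (E_n \in R^+).$

(4) $\mu^+$ is continuous from above on $R^+$.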

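Proof. (1) It suffices to consider, in the limit, the same relation for R.
(2) A consequence of (1).
(3) If $E_n \in R$ (n = 1, 2, ...) then

$\mu\Bigl(\bigcup_{i=1}^{n} E_i\Bigr) \le \sum_{i=1}^{n} \mu(E_i),$

and therefore

$\mu^+\Bigl(\bigcup_{n=1}^{\infty} E_n\Bigr) = \lim_n \mu\Bigl(\bigcup_{i=1}^{n} E_i\Bigr) \le \lim_n \sum_{i=1}^{n} \mu(E_i) = \sum_{n=1}^{\infty} \mu(E_n).$

If $E_n \in R^+$ (n = 1, 2, ...) the $E_n$ may be decomposed

$E_n = \sum_{m=1}^{\infty} E_{nm} \quad (E_{nm} \in R),$

so that $\mu^+(E_n) = \sum_m \mu(E_{nm})$, and

$\sum_{n=1}^{\infty} \mu^+(E_n) = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \mu(E_{nm}) \ge \mu^+\Bigl(\bigcup_{n=1}^{\infty} \bigcup_{m=1}^{\infty} E_{nm}\Bigr) = \mu^+\Bigl(\bigcup_{n=1}^{\infty} E_n\Bigr).$

(4) It must be shown that if $\{E_n\}$ ($E_n \in R^+$) is a descending sequence of sets
of finite measure ($\mu^+$) converging to $E \in R^+$, then

$\lim \mu^+(E_n) = \mu^+(E).$

We first prove the theorem for the case E = 0. Let $\epsilon > 0$ be arbitrary.
Each $E_n$ covers a set $F_n \in R$ such that

$\mu^+(E_n - F_n) = \mu^+(E_n) - \mu^+(F_n) < 2^{-n}\epsilon.$

(Since $F_n \in R$, $E_n - F_n \in R^+$, and $\mu^+$ is subtractive on $R^+$.) Set

$G_n = \bigcap_{i=1}^{n} F_i;$

then $G_n \in R$ and $G_n \subset E_n$, so that $\lim G_n = 0$. Consequently, $\lim \mu(G_n) = 0$.
Since

$\mu^+(E_n - G_n) = \mu^+\Bigl(\bigcup_{i=1}^{n} (E_n - F_i)\Bigr) \le \mu^+\Bigl(\bigcup_{i=1}^{n} (E_i - F_i)\Bigr) \le \sum_{i=1}^{n} \mu^+(E_i - F_i) < \epsilon,$

we have

$\mu^+(E_n) - \mu(G_n) < \epsilon, \qquad \lim \mu^+(E_n) \le \epsilon.$

Since $\epsilon$ is arbitrary, $\lim \mu^+(E_n) = 0$. In the general case, $E = \lim E_n$
is any set of finite measure ($\mu^+$). There exists an ascending sequence $\{F_n\}$
($F_n \in R$) converging to E. $\{E_n - F_n\}$ is a descending sequence ($E_n - F_n \in R^+$)
converging to 0; therefore

$\lim \mu^+(E_n) - \lim \mu^+(F_n) = \lim \mu^+(E_n - F_n) = 0,$
$\lim \mu^+(E_n) = \lim \mu(F_n) = \mu^+(E).$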


THEOREM 2.3. If $\mu$ is a measure on the ring R there exists a measure $\mu^0$ on $R^0$
which is an extension of $\mu$.


Proof. Let E be any set of $R^0$: if every $E^+ \in R^+$ which covers E is of infinite
measure ($\mu^+$), we set $\mu^0(E) = \infty$. Suppose now that there exists a set of $R^+$
of finite measure ($\mu^+$) covering E; then there exists a descending sequence
$\{E_n\}$ of sets of $R^+$, all of finite measure ($\mu^+$), converging to E. If $\{F_n\}$ is a
second sequence, with the properties of $\{E_n\}$, converging to E, then it is easily
verified, as in the proof of 2.1, using the fact that $\mu^+$ is continuous from above
on $R^+$, that $\lim \mu^+(E_n) = \lim \mu^+(F_n)$. Hence we may define, without ambiguity,
$\mu^0(E) = \lim \mu^+(E_n)$. If $E, F \in R^0$ and $E \subset F$, then $\mu^0(E) \le \mu^0(F)$. For it
suffices to consider the case $\mu^0(E) < \infty$, $\mu^0(F) < \infty$, and then the proof is
analogous to the case treated in the proof of 2.1. To prove that $\mu^0(E + F) =
\mu^0(E) + \mu^0(F)$, it suffices to consider the case $\mu^0(E) < \infty$, $\mu^0(F) < \infty$. Let
$\{E_n\}$, $\{F_n\}$ be descending sequences of sets of $R^+$, all of finite measure ($\mu^+$),
converging to E, F respectively. We have

$\mu^+(E_n \cup F_n) = \mu^+(E_n) + \mu^+(F_n) - \mu^+(E_n \cap F_n),$

and since $\lim (E_n \cap F_n) = 0$, by (2.2 (4)), $\lim \mu^+(E_n \cap F_n) = 0$. In the limit
we have the required equation.

Suppose that

$E = \sum_{n=1}^{\infty} E_n \in R^0 \quad (E_n \in R^0).$

It follows from the monotone property that

$\mu^0(E) \ge \sum_{n=1}^{\infty} \mu^0(E_n).$

It remains to prove the inverse inequality for the case $\mu^0(E_n) < \infty$ (n = 1,
2, ...). Let $\epsilon > 0$ be arbitrary. For each index n there exists $F_n \in R^+$, covering
$E_n$, such that $\mu^+(F_n) - \mu^0(E_n) < 2^{-n}\epsilon$. We have, by 2.2 (3),

$\mu^0\Bigl(\sum_{n=1}^{\infty} E_n\Bigr) \le \mu^+\Bigl(\bigcup_{n=1}^{\infty} F_n\Bigr) \le \sum_{n=1}^{\infty} \mu^+(F_n) < \sum_{n=1}^{\infty} \mu^0(E_n) + \epsilon,$

so that

$\mu^0\Bigl(\sum_{n=1}^{\infty} E_n\Bigr) \le \sum_{n=1}^{\infty} \mu^0(E_n),$

and the proof is complete.
The measure $\mu^0$ on $R^0$ is an extension of $\mu^+$ on $R^+$, which in turn is an
extension of $\mu$ on R. In what follows, $\mu^0$ denotes the extension to $R^0$ of $\mu$ on R
according to Theorem 2.3. Consider the transfinite sequence of rings (1.3):

$R = R_0 \subset R_1 \subset R_2 \subset \ldots \subset R_\alpha \subset \ldots \subset R_{\omega_1};$

an extension of $\mu$ on R to a measure $\mu_\alpha$ on $R_\alpha$ is called a "normal extension"
if for every ordinal $\beta$ of the first kind ($0 < \beta \le \alpha$),

$\mu_\beta = \mu^0_{\beta-1}.$

We see directly that the normal extension $\mu_\alpha$, if it exists, is unique. The
following theorem will serve as a lemma for the transfinite induction.


THEOREM 2.4. Suppose that the normal extension $\mu_\alpha$ exists for a given ordinal
$\alpha$. Let $\epsilon > 0$ be arbitrary; for every $E \in R_\alpha$ of finite measure ($\mu_\alpha$), there exists
$E^+ \in R_0^+ = R^+$, covering E, such that $\mu_\alpha(E^+) - \mu_\alpha(E) < \epsilon$.


Proof. It suffices to prove the theorem for $\alpha$ of the first kind, assuming
the theorem for $\alpha - 1$. Let $E \in R_\alpha$ be of finite measure ($\mu_\alpha$), and suppose
in the first place that $E \in R^+_{\alpha-1}$. We may express E as a union

$E = \bigcup_{n=1}^{\infty} E_n \quad (E_n \in R_{\alpha-1}),$

and by the induction hypothesis there exists $E_n^+ \in R_0^+$, covering $E_n$, such
that

$\mu_\alpha(E_n^+) - \mu_\alpha(E_n) < 2^{-n}\epsilon.$

The set

$E^+ = \bigcup_{n=1}^{\infty} E_n^+$

covers E, and

$\mu_\alpha(E^+) - \mu_\alpha(E) = \mu_\alpha(E^+ - E) = \mu_\alpha\Bigl(\bigcup_{n=1}^{\infty} (E_n^+ - E)\Bigr) \le \mu_\alpha\Bigl(\bigcup_{n=1}^{\infty} (E_n^+ - E_n)\Bigr) \le \sum_{n=1}^{\infty} \mu_\alpha(E_n^+ - E_n) < \epsilon.$

The theorem is proved for $R^+_{\alpha-1}$; then it is evidently also true for $R_\alpha$, and the
proof is complete.


We now prove the fundamental theorem of the paper: 


THEOREM 2.5. If $\mu$ is a measure on a ring R there exists a measure $\bar\mu$ on the
generated $\sigma$-ring S(R), which is the normal extension of $\mu$.


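Proof. Let $\alpha$ be any ordinal such that $0 < \alpha \le \omega_1$. It suffices to prove the
existence of the normal extension $\mu_\alpha$, assuming the existence of all the normal
extensions $\mu_\beta$ for $\beta < \alpha$. If $\alpha$ is of the first kind we set $\mu_\alpha = \mu^0_{\alpha-1}$. Suppose
that $\alpha$ is of the second kind. Let E be any set of $R_\alpha$, then there exists $\beta < \alpha$
such that $E \in R_\beta$, and we set $\mu_\alpha(E) = \mu_\beta(E)$. Then $\mu_\alpha$ is defined, without
ambiguity, as an additive, monotone function on $R_\alpha$, which is an extension of
$\mu_\beta$ for every $\beta < \alpha$. It remains to prove the countable additivity of $\mu_\alpha$. Suppose
that

$E = \sum_{n=1}^{\infty} E_n \quad (E, E_n \in R_\alpha).$

It will be sufficient to show that

$\mu_\alpha(E) \le \sum_{n=1}^{\infty} \mu_\alpha(E_n)$

for the case that $\mu_\alpha(E_n) < \infty$ (n = 1, 2, ...) and $\alpha$ is of the second kind.
For each index n there exists an ordinal $\beta(n) < \alpha$ such that $E_n \in R_{\beta(n)}$. Since
$\mu_{\beta(n)}$ is normal there exists $E_n^+ \in R_0^+$ covering $E_n$ such that

$\mu_\alpha(E_n^+) - \mu_\alpha(E_n) < 2^{-n}\epsilon,$

$\epsilon$ being an arbitrary positive number (2.4). Since $\bigcup E_n^+ \in R_0^+$,

$\mu_\alpha(E) \le \mu_\alpha\Bigl(\bigcup_{n=1}^{\infty} E_n^+\Bigr) \le \sum_{n=1}^{\infty} \mu_\alpha(E_n^+) < \sum_{n=1}^{\infty} \mu_\alpha(E_n) + \epsilon,$

so that

$\mu_\alpha(E) \le \sum_{n=1}^{\infty} \mu_\alpha(E_n)$

and this completes the proof.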


Henceforth $\bar\mu$ will denote the (unique) normal extension to $S(R)$ of the measure $\mu$ on the ring $R$. It follows from Theorems 2.4 and 2.5 that, $\varepsilon > 0$ being arbitrary, for every $E \in S(R)$ of finite measure ($\bar\mu$), there exists a set $E^+ \in R^+$, covering $E$, such that $\bar\mu(E^+) - \bar\mu(E) < \varepsilon$. We prove next the dual of Theorem 2.4.


THEOREM 2.6. Let $\varepsilon > 0$ be arbitrary. For every $E \in S(R)$ of finite measure ($\bar\mu$) there exists $E^- \in R^-$, covered by $E$, such that $\bar\mu(E) - \bar\mu(E^-) < \varepsilon$.


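Proof. It suffices to prove the theorem for $R_\alpha$, $\alpha$ of the first kind, under the supposition that the theorem is true for $R_{\alpha-1}$. Then the theorem is evident for $R_{\alpha-1}^+$. Let $E$ be any set of $R_\alpha$ of finite measure ($\bar\mu$): there exists a descending sequence $\{E_n\}$ ($E_n \in R_{\alpha-1}^+$) of sets of finite measure ($\bar\mu$) converging to $E$. For each $n$ there exists $F_n^- \in R^-$ covered by $E_n$ such that

$$\bar\mu(E_n) - \bar\mu(F_n^-) < 2^{-n-1}\varepsilon.$$

Set

$$H_n = \bigcap_{i=1}^{n} F_i^-,$$

so that $\{H_n\}$ is a descending sequence ($H_n \in R^-$), whose limit

$$H = \bigcap_{n=1}^{\infty} H_n$$

belongs to $R^-$, such that $H_n \subset E_n$, $H \subset E$. We have

$$\bar\mu(E_n - H_n) = \bar\mu\Big(\bigcup_{i=1}^{n}(E_n - F_i^-)\Big) \le \bar\mu\Big(\bigcup_{i=1}^{n}(E_i - F_i^-)\Big) \le \sum_{i=1}^{n}\bar\mu(E_i - F_i^-) < 2^{-1}\varepsilon.$$

Therefore $\bar\mu(E_n) - \bar\mu(H_n) < 2^{-1}\varepsilon$, and in the limit, $\bar\mu(E) - \bar\mu(H) < \varepsilon$.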

The dual of the above method, passing by $R^-$ instead of by $R^+$, does not work in the general case. To see why, it suffices to consider the case of a set $E \in R^- - R$ such that for every ascending sequence $\{E_n\}$ ($E_n \in R$) converging to $E$ from below, $\lim \mu(E_n) < \infty$, and for every descending sequence $\{F_n\}$ ($F_n \in R$) converging to $E$ from above, $\mu(F_n) = \infty$ ($n = 1, 2, \dots$). For the same reason, the direct induction from $R$ to $R^0$ by the formula

$$\lim \mu(E_n) = \mu^0(E) \qquad (E_n \in R,\ \lim E_n = E \in R^0)$$

is not applicable to the general case. However, if $\mu$ is finite and bounded on $R$, it can be shown that each of these alternative methods is applicable. Then, each must give the normal extension, since, in this case, the extension from $R$ to $S(R)$ is unique.


REFERENCES 


1. J. Albuquerque, Ensembles de Borel, Portugaliae Math. (1944).

2. E. Borel, Leçons sur la théorie des fonctions. Collection de monographies sur la théorie des fonctions publiées sous la direction de Émile Borel (Paris, 1950).

3. P. R. Halmos, Measure Theory (New York, 1950).

4. N. Lusin, Leçons sur les ensembles analytiques et leurs applications. Collection de monographies sur la théorie des fonctions publiées sous la direction de Émile Borel (Paris, 1930).

5. R. Neves, Sôbre a construção algébrica da teoria geral da medida. Instituto para a Alta Cultura, Centro de Estudos Matemáticos, Publ. no. 13, 22 pp. (1945).


Université de Montréal 











EXTREMAL PROPERTIES OF HERMITIAN MATRICES 
M. MARCUS AND J. L. McGREGOR


1. Introduction. In (1) Fan showed that if $A$ is a Hermitian matrix with eigenvalues $\lambda_1 \le \dots \le \lambda_n$ then, for $k \le n$,

$$\max \sum_{j=1}^{k} (Ax_j, x_j) = \sum_{j=1}^{k} \lambda_{n-j+1},$$

$$\min \sum_{j=1}^{k} (Ax_j, x_j) = \sum_{j=1}^{k} \lambda_j,$$

where $x_1, \dots, x_k$ run over all sets of $k$ orthonormal (o.n.) vectors in unitary $n$-space $V$.

It is the purpose of this paper to extend this result to the compound of a 
non-negative Hermitian (n.n.h.) matrix and investigate some of the con- 
sequences of this extension. 

In the sequel $\mathrm{tr}(L)$ will denote the trace of the matrix $L$ and the Euclidean norm of $L$ will be designated by $\|L\| = (\mathrm{tr}(L^*L))^{1/2}$ where $L^*$ is the conjugate transpose of $L$. $F(L)$ is the convex image of the unit sphere $\|x\| = 1$ in the complex plane under the mapping $x \to (Lx, x)$.

For $1 \le r \le n$ let $V^{(r)}$ denote the $r$th compound space of $V$. A vector $z \in V^{(r)}$ will be designated by

$$z = x_1 \wedge \dots \wedge x_r, \qquad x_j \in V,$$

where the indicated product is the usual Grassmann notation for the exterior product (2). The inner product in $V^{(r)}$ is defined by

$$(x_1 \wedge \dots \wedge x_r,\ y_1 \wedge \dots \wedge y_r) = \det\{(x_i, y_j)\}, \qquad i, j = 1, \dots, r.$$

If $A$ is a linear transformation on $V$ to $V$ then the induced compound of $A$ on $V^{(r)}$ to $V^{(r)}$ is denoted by $C_r(A)$ and is defined by

$$C_r(A)\, x_1 \wedge \dots \wedge x_r = Ax_1 \wedge \dots \wedge Ax_r.$$


We list some of the essential properties of $C_r(A)$ that will subsequently be used (6).

(i) $C_r(AB) = C_r(A)\, C_r(B)$.

(ii) If $A$ is non-singular, normal, Hermitian, unitary, non-negative, then $C_r(A)$ has the corresponding property.








Received July 25, 1955; in revised form April 14, 1956. 






















(iii) The eigenvalues of $C_r(A)$ are all possible $\binom{n}{r}$ products of $r$ of the eigenvalues of $A$.
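Properties (i) and (iii) can be checked on small examples. The sketch below is an illustration only, not part of the paper: it assumes the standard fact that, in the basis of products of coordinate vectors, the matrix of $C_r(A)$ consists of the $r \times r$ minors of $A$ with index sets in lexicographic order. The helper names `det`, `compound`, `matmul` and `elementary_symmetric` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        # Look for a nonzero pivot at or below the diagonal.
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= factor * m[c][j]
    return result


def compound(a, r):
    """Matrix of all r x r minors of a, index sets in lexicographic order."""
    idx = list(combinations(range(len(a)), r))
    return [[det([[a[i][j] for j in cols] for i in rows]) for cols in idx]
            for rows in idx]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def elementary_symmetric(values, r):
    """Sum of all products of r of the given values."""
    total = 0
    for combo in combinations(values, r):
        term = 1
        for v in combo:
            term *= v
        total += term
    return total


A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
B = [[1, 0, 2], [0, 1, 1], [3, 1, 0]]
D = [[2, 0, 0], [0, 3, 0], [0, 0, 5]]   # diagonal, eigenvalues 2, 3, 5

# Property (i): the compound of a product is the product of the compounds.
multiplicative = compound(matmul(A, B), 2) == matmul(compound(A, 2), compound(B, 2))

# Property (iii) for a diagonal matrix: the diagonal of the second compound
# lists the pairwise products of the eigenvalues.
C2 = compound(D, 2)
diagonal = [C2[i][i] for i in range(3)]
```

For the diagonal matrix `D` the trace of the second compound equals the second elementary symmetric function of 2, 3, 5, which is the identity used repeatedly in the proofs that follow.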
To state subsequent results more compactly we introduce some notation. The set of $\binom{k}{r}$ distinct choices of integers satisfying $1 \le i_1 < i_2 < \dots < i_r \le k$ will be denoted by $Q_{kr}$ and a typical sequence in $Q_{kr}$ will be denoted by $\omega$. If $x_1, \dots, x_k$ is a choice of $k$ vectors in $V$ then a typical product

$$x_{i_1} \wedge \dots \wedge x_{i_r} \in V^{(r)}$$

will be denoted by $x_\omega$. $E_r(a_1, \dots, a_k)$ will denote the $r$th elementary symmetric function of the numbers $a_1, \dots, a_k$:

$$E_r(a_1, \dots, a_k) = \sum_{\omega \in Q_{kr}} \prod_{j=1}^{r} a_{i_j}.$$


2. Results on Hermitian matrices. The basic result is contained in


THEOREM 1. Let $1 \le r \le k \le n$ and let $A$ be an $n$-square positive definite Hermitian matrix with eigenvalues $0 < \alpha_1 \le \alpha_2 \le \dots \le \alpha_n$. Then

$$\max \sum_{\omega \in Q_{kr}} (C_r(A)\, x_\omega, x_\omega) = E_r(\alpha_n, \dots, \alpha_{n-k+1}),$$

$$\min \sum_{\omega \in Q_{kr}} (C_r(A)\, x_\omega, x_\omega) = E_r(\alpha_1, \dots, \alpha_k),$$

where both max and min are taken over all sets of $k$ o.n. vectors $x_1, \dots, x_k$ in $V$.
Proof. Set

$$g(x_1, \dots, x_k) = \sum_{\omega \in Q_{kr}} (C_r(A)\, x_\omega, x_\omega).$$

First it is clear that a set of maximizing (minimizing) o.n. vectors exists. This is easily seen using a standard continuity argument. If $k = n$ then

$$g(x_1, \dots, x_n) = \mathrm{tr}\, C_r(A) = E_r(\alpha_1, \dots, \alpha_n)$$

and the result thus follows trivially whenever the number of vectors is equal to the dimension of the space. Now for $k < n$ let $y_1, \dots, y_k$ be a minimizing set for $g$. The following argument is the same if $y_1, \dots, y_k$ is a maximizing set. Consider the linear subspace $L$ of $V$ spanned by $y_1, \dots, y_k$. Let $P$ be the orthogonal projection onto $L$. Consider the mapping $PA$ on $L$ to $L$. Clearly if $x$ and $y$ belong to $L$ then

$$(PAx, y) = (Ax, Py) = (Ax, y) = (x, Ay) = (Px, Ay) = (x, PAy),$$

so that $PA$ is positive definite Hermitian on $L$ to $L$. Let $u_1, \dots, u_k$ be o.n. eigenvectors of $PA$ in $L$. Then













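$$\begin{aligned}
g(y_1, \dots, y_k) &= \sum_{\omega \in Q_{kr}} (C_r(A)\, y_\omega, y_\omega) \\
&= \sum_{\omega} \det\{(Ay_{i_s}, y_{i_t})\}, \qquad s, t = 1, \dots, r, \\
&= \sum_{\omega} \det\{(PAy_{i_s}, y_{i_t})\} \\
&= \sum_{\omega} (C_r(PA)\, y_\omega, y_\omega) = \mathrm{tr}\, C_r(PA) \\
&= \sum_{\omega} (C_r(PA)\, u_\omega, u_\omega) = \sum_{\omega} (C_r(A)\, u_\omega, u_\omega) \\
&= g(u_1, \dots, u_k).
\end{aligned}$$

At this point we prove a lemma reducing this situation to the case $k = n$.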


LEMMA 1. $L$ is an invariant subspace of $A$.


Proof. If $L$ is not invariant under $A$ we lose no generality in assuming that $Au_1 \notin L$. Then there exists a unit vector $v$ in the orthogonal complement of $L$ such that

$$\rho = (Au_1, v) \ne 0.$$

We define

$$u_1' = \frac{u_1 - t\rho v}{\sqrt{1 + t^2|\rho|^2}}, \qquad u_j' = u_j, \quad j = 2, \dots, k,$$

where $t$ is a real number. It is easy to check that $u_1', \dots, u_k'$ is an o.n. set. Since $g(u_1, \dots, u_k)$ is a minimum for $g$ it follows that

$$\frac{d}{dt}\, g(u_1', \dots, u_k') = 0 \quad \text{for } t = 0.$$

Using the multilinearity of the Grassmann product we compute that for $t = 0$

$$\begin{aligned}
\frac{d}{dt}\Big(C_r(A)\, u_1' \wedge u_{i_2} \wedge \dots \wedge u_{i_r},\ u_1' \wedge u_{i_2} \wedge \dots \wedge u_{i_r}\Big)
&= -\rho\,\big(C_r(A)\, v \wedge u_{i_2} \wedge \dots \wedge u_{i_r},\ u_1 \wedge u_{i_2} \wedge \dots \wedge u_{i_r}\big) \\
&\quad - \bar\rho\,\big(C_r(A)\, u_1 \wedge u_{i_2} \wedge \dots \wedge u_{i_r},\ v \wedge u_{i_2} \wedge \dots \wedge u_{i_r}\big) \\
&= -2|\rho|^2 \prod_{j=2}^{r} (Au_{i_j}, u_{i_j}).
\end{aligned}$$

Here we have used the fact that if $s, t \ge 2$ and $s \ne t$ then

$$(Au_{i_s}, u_{i_t}) = (PAu_{i_s}, u_{i_t}) = 0,$$

since $u_1, \dots, u_k$ is an o.n. set of eigenvectors of $PA$ on $L$ to $L$. Furthermore it is clear that

$$\prod_{j=2}^{r} (Au_{i_j}, u_{i_j}) > 0,$$

and hence at $t = 0$

$$\frac{d}{dt}\, g(u_1', \dots, u_k') \ne 0$$

and the proof of Lemma 1 is complete.


The proof of Theorem 1 is now easily completed. Since $L$ is invariant under $A$, let $B$ be the restriction of $A$ to $L$. Then $B$ is a positive definite Hermitian transformation on a $k$ dimensional subspace onto itself and the eigenvalues of $B$ are some $k$ of the eigenvalues of $A$, say $\alpha_{i_1}, \dots, \alpha_{i_k}$. Thus

$$g(y_1, \dots, y_k) = \sum_{\omega \in Q_{kr}} (C_r(B)\, y_\omega, y_\omega) = \mathrm{tr}\, C_r(B) = E_r(\alpha_{i_1}, \dots, \alpha_{i_k}) \ge E_r(\alpha_1, \dots, \alpha_k).$$

Thus

$$g(x_1, \dots, x_k) \ge E_r(\alpha_1, \dots, \alpha_k)$$

for any o.n. vectors $x_1, \dots, x_k$ and equality is attained by choosing a set of o.n. eigenvectors of $A$ corresponding to $\alpha_1, \dots, \alpha_k$.
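A numerical spot check of Theorem 1 may help fix the notation. The sketch below is an illustration under stated assumptions rather than part of the argument: it takes $A$ real and diagonal (no loss of generality up to a unitary change of basis), samples real o.n. frames by Gram-Schmidt, and evaluates $g$ through the determinants $\det\{(Ax_{i_s}, x_{i_t})\}$. All function names are hypothetical.

```python
import random
from itertools import combinations


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(row) for row in m]
    n = len(m)
    result = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if m[pivot][c] == 0:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= f * m[c][j]
    return result


def elementary_symmetric(values, r):
    total = 0.0
    for combo in combinations(values, r):
        term = 1.0
        for v in combo:
            term *= v
        total += term
    return total


def random_frame(n, k, rng):
    """k orthonormal vectors in R^n, by Gram-Schmidt on Gaussian samples."""
    frame = []
    while len(frame) < k:
        v = [rng.gauss(0, 1) for _ in range(n)]
        for u in frame:
            c = sum(a * b for a, b in zip(v, u))
            v = [a - c * b for a, b in zip(v, u)]
        norm = sum(a * a for a in v) ** 0.5
        if norm > 1e-9:
            frame.append([a / norm for a in v])
    return frame


def g(alphas, frame, r):
    """Sum over omega of det{(A x_s, x_t)} for A = diag(alphas)."""
    gram = [[sum(a * xs * xt for a, xs, xt in zip(alphas, x, y)) for y in frame]
            for x in frame]
    return sum(det([[gram[s][t] for t in w] for s in w])
               for w in combinations(range(len(frame)), r))


alphas = [1, 2, 3, 5, 8]          # eigenvalues in increasing order
k, r = 3, 2
lower = elementary_symmetric(alphas[:k], r)    # E_r of the k smallest
upper = elementary_symmetric(alphas[-k:], r)   # E_r of the k largest

rng = random.Random(0)
samples = [g(alphas, random_frame(len(alphas), k, rng), r) for _ in range(200)]
within_bounds = all(lower - 1e-9 <= s <= upper + 1e-9 for s in samples)

# The minimum is attained on eigenvectors for the k smallest eigenvalues.
eigen_frame = [[1, 0, 0, 0, 0], [0, 1, 0, 0, 0], [0, 0, 1, 0, 0]]
attained = abs(g(alphas, eigen_frame, r) - lower) < 1e-12
```

Random sampling can only fail to contradict the theorem; it does not replace the proof above.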


Remark. Theorem 1 is true for A simply n.n.h. and can be established 
by continuity from the case A positive definite. Actually Fan’s Theorem for 
the sum can be proved in exactly the same way using only the condition that 
A is Hermitian. It is worth noting that Theorem 1 cannot be obtained directly 
by applying Fan’s result to C,(A). The difficulty arises from the fact that 
the lexicographic ordering of the eigenvalues of C,(A) does not necessarily 
coincide with the ordering by magnitude. Throughout this section we will 
assume A is n.n.h. unless otherwise stated. A result of A. Ostrowski (5) now 
follows easily. 


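COROLLARY 1. For $1 \le r \le k \le n$

$$\min E_r((Ax_1, x_1), \dots, (Ax_k, x_k)) = E_r(\alpha_1, \dots, \alpha_k),$$

where the min is taken over all sets of $k$ o.n. vectors $x_1, \dots, x_k$ in $V$.


Proof. It follows from the Hadamard determinant theorem and Theorem 1 that

$$\begin{aligned}
E_r(\alpha_1, \dots, \alpha_k) &\le g(x_1, \dots, x_k) = \sum_{\omega \in Q_{kr}} (C_r(A)\, x_\omega, x_\omega) \\
&\le \sum_{\omega \in Q_{kr}} \prod_{s=1}^{r} (Ax_{i_s}, x_{i_s}) \\
&= E_r((Ax_1, x_1), \dots, (Ax_k, x_k)).
\end{aligned}$$

As before, the minimum is taken on.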













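COROLLARY 2. Under the same hypotheses as Corollary 1,

$$\max E_r((Ax_1, x_1), \dots, (Ax_k, x_k)) = \binom{k}{r}\Big(\frac{1}{k}\sum_{j=1}^{k} \alpha_{n-j+1}\Big)^{r}.$$

Proof. By Fan's result

$$\max \sum_{t=1}^{k} (Ax_t, x_t) = E_1(\alpha_n, \dots, \alpha_{n-k+1}).$$

Then by (3; Theorem 52)

$$E_r((Ax_1, x_1), \dots, (Ax_k, x_k)) \le \binom{k}{r}\Big(\frac{1}{k}\sum_{j=1}^{k} (Ax_j, x_j)\Big)^{r} \le \binom{k}{r}\Big(\frac{1}{k}\sum_{j=1}^{k} \alpha_{n-j+1}\Big)^{r}.$$

We must show that this value is actually taken on. This is accomplished by use of the following elementary lemma.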


LEMMA 2. If $T$ is a linear transformation on $V$ to $V$ then there exists an o.n. set of vectors $v_j \in V$, $j = 1, \dots, n$, such that

$$(Tv_j, v_j) = n^{-1}\,\mathrm{tr}(T), \qquad j = 1, \dots, n.$$

Proof. We use an induction argument to exhibit a unitary matrix $R$ such that

$$(R^*TR)_{ii} = n^{-1}\,\mathrm{tr}(T), \qquad i = 1, \dots, n.$$

For $r = 1$ it is clear since $n^{-1}\,\mathrm{tr}(T) \in F(T)$. Suppose there exists a unitary $U$ such that

$$U^*TU = \begin{pmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{pmatrix}$$

with $T_{11}$, $T_{22}$, $r$ and $(n - r)$ square matrices respectively and $(T_{11})_{ii} = n^{-1}\,\mathrm{tr}(T)$. Then $\mathrm{tr}(T_{22}) = (n - r)\,n^{-1}\,\mathrm{tr}(T)$ and applying the case $r = 1$ to $T_{22}$ we select a unitary $(n - r)$-square matrix $S$ such that

$$(S^*T_{22}S)_{11} = n^{-1}\,\mathrm{tr}(T).$$

Define the $n$-square unitary matrix $V$ by $V = \mathrm{diag}(I, S)$ and set $R = UV$. This completes the induction.

Actually, for the purposes of this proof, we need only know Lemma 2 for $T$ Hermitian. In this case we can readily exhibit an o.n. set $v_j$ satisfying Lemma 2; let $u_1, \dots, u_n$ be an o.n. set of eigenvectors of $T$ and let $\theta$ be a primitive $n$th root of unity. Then set

$$v_j = n^{-1/2} \sum_{t=1}^{n} \theta^{jt} u_t.$$
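The construction for Hermitian $T$ can be verified directly. The following sketch assumes the form $v_j = n^{-1/2} \sum_t \theta^{jt} u_t$ with $\theta = e^{2\pi i/n}$ and works in the eigenbasis of $T$, so that $T$ is represented by the list of its eigenvalues; the names `fourier_frame`, `inner` and `quadratic_form` are hypothetical.

```python
import cmath


def fourier_frame(n):
    """Coordinates of v_1, ..., v_n in the eigenbasis u_1, ..., u_n."""
    theta = cmath.exp(2j * cmath.pi / n)   # a primitive n-th root of unity
    return [[theta ** (j * t) / n ** 0.5 for t in range(1, n + 1)]
            for j in range(1, n + 1)]


def inner(x, y):
    """Inner product, linear in x and conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def quadratic_form(eigs, v):
    """(T v, v) when T is diagonal with the given eigenvalues."""
    return inner([lam * c for lam, c in zip(eigs, v)], v)


eigs = [1.0, 4.0, 7.0, 10.0]
frame = fourier_frame(len(eigs))
mean = sum(eigs) / len(eigs)               # n^{-1} tr(T)

orthonormal = all(
    abs(inner(frame[i], frame[j]) - (1 if i == j else 0)) < 1e-9
    for i in range(len(frame)) for j in range(len(frame)))
constant_diagonal = all(abs(quadratic_form(eigs, v) - mean) < 1e-9 for v in frame)
```

Each coordinate of $v_j$ has modulus $n^{-1/2}$, which is why every $(Tv_j, v_j)$ reduces to the average of the eigenvalues.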
Returning to the proof of Corollary 2, we select o.n. eigenvectors $y_n, \dots, y_{n-k+1}$ corresponding to the eigenvalues $\alpha_n, \dots, \alpha_{n-k+1}$ respectively. These span a subspace invariant under $A$ and by restricting $A$ to this $k$-dimensional subspace and applying Lemma 2 to the restricted transformation we select $k$ o.n. vectors $x_1, \dots, x_k$ such that

$$(Ax_j, x_j) = \frac{1}{k}\sum_{t=1}^{k} \alpha_{n-t+1}, \qquad j = 1, \dots, k.$$

Clearly, for this choice of the $x_j$,

$$E_r((Ax_1, x_1), \dots, (Ax_k, x_k)) = \binom{k}{r}\Big(\frac{1}{k}\sum_{j=1}^{k} \alpha_{n-j+1}\Big)^{r},$$

and the proof is complete.
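The inequality quoted from (3; Theorem 52) bounds $E_r$ of $k$ non-negative numbers by $\binom{k}{r}$ times the $r$th power of their mean, with equality when the numbers coincide. A minimal sketch with hypothetical helper names and arbitrarily chosen sample values:

```python
from itertools import combinations
from math import comb, prod


def elementary_symmetric(values, r):
    """Sum of all products of r of the given values."""
    return sum(prod(c) for c in combinations(values, r))


def mean_bound(values, r):
    """binom(k, r) * (arithmetic mean)^r for k = len(values)."""
    k = len(values)
    return comb(k, r) * (sum(values) / k) ** r


unequal = [1, 2, 3, 6]        # mean 3
equal = [3, 3, 3, 3]          # same mean, all entries equal

gap = mean_bound(unequal, 2) - elementary_symmetric(unequal, 2)
tight = mean_bound(equal, 2) == elementary_symmetric(equal, 2)
```

This is why the maximum in Corollary 2 is reached by vectors for which all the values $(Ax_j, x_j)$ are equal, which is exactly what Lemma 2 supplies.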


COROLLARY 3. For $1 \le i_1 < i_2 < \dots < i_k \le n$,

$$\prod_{j=1}^{k} \alpha_j \le \prod_{j=1}^{k} A_{i_j i_j} \le \Big(\frac{1}{k}\sum_{j=1}^{k} \alpha_{n-j+1}\Big)^{k}.$$

Proof. Let $e_j$ be the unit vector with 1 in the $j$th position and 0 elsewhere. Then

$$(Ae_{i_j}, e_{i_j}) = A_{i_j i_j}$$

and the result follows from Corollaries 1 and 2 (with $r = k$). We remark that for $k = n$ we have the Hadamard determinant inequality. We also note that the lower inequality is contained in (5).


COROLLARY 4. If $A$ is an arbitrary matrix with row vectors $A_1, \dots, A_n$ then for $1 \le i_1 < \dots < i_k \le n$

$$\prod_{j=1}^{k} \alpha_j \le \prod_{j=1}^{k} \|A_{i_j}\| \le \Big(\frac{1}{k}\sum_{j=1}^{k} \alpha_{n-j+1}^{2}\Big)^{k/2}$$

where $\alpha_1 \le \dots \le \alpha_n$ are the non-negative square roots of the eigenvalues of $A^*A$.

Proof. Apply Corollary 3 to $A^*A$.


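COROLLARY 5. Assume $A$ satisfies the conditions of Theorem 1. Let $0 \le \omega_1 \le \dots \le \omega_k$ be $k$ non-negative numbers, $k \le n$. Then

$$\min \prod_{j=1}^{k} (Ax_j, x_j)^{\omega_j} = \prod_{j=1}^{k} \alpha_j^{\omega_{k-j+1}}.$$

Proof.

$$\begin{aligned}
\prod_{j=1}^{k} (Ax_j, x_j)^{\omega_j}
&= \prod_{j=1}^{k} (Ax_j, x_j)^{\omega_1} \prod_{j=2}^{k} (Ax_j, x_j)^{\omega_2 - \omega_1} \cdots (Ax_k, x_k)^{\omega_k - \omega_{k-1}} \\
&\ge \prod_{j=1}^{k} \alpha_j^{\omega_1} \prod_{j=1}^{k-1} \alpha_j^{\omega_2 - \omega_1} \cdots \alpha_1^{\omega_k - \omega_{k-1}} \\
&= \prod_{j=1}^{k} \alpha_j^{\omega_{k-j+1}},
\end{aligned}$$

and the latter value is clearly assumed.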













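COROLLARY 6. If $A$ and $B$ are arbitrary $n$-square complex matrices then

$$\|AB\|^2 \ge \max\Big\{ \|A\|^2 \prod_{i=1}^{n} \beta_i^{\,\alpha_{n-i+1}\|A\|^{-2}},\ \ \|B\|^2 \prod_{i=1}^{n} \alpha_i^{\,\beta_{n-i+1}\|B\|^{-2}} \Big\}$$

where $0 \le \alpha_i \le \alpha_{i+1}$ and $0 \le \beta_i \le \beta_{i+1}$ ($i = 1, \dots, n - 1$) are the eigenvalues of $A^*A$ and $B^*B$ respectively.


Proof. $\|AB\|^2 = \mathrm{tr}\{ABB^*A^*\} = \mathrm{tr}\{(A^*A)^{1/2}(BB^*)(A^*A)^{1/2}\}$. Let $y_1, \dots, y_n$ be an o.n. set of eigenvectors of $(A^*A)^{1/2}$ corresponding respectively to $\alpha_1^{1/2}, \dots, \alpha_n^{1/2}$. Since $\sum \alpha_i = \|A\|^2$ and $BB^*$ has the same eigenvalues as $B^*B$, the inequality between the arithmetic and geometric means and Corollary 5 give

$$\begin{aligned}
\|AB\|^2 &= \sum_{i=1}^{n} \alpha_i\,(BB^*y_i, y_i) \\
&\ge \|A\|^2 \prod_{i=1}^{n} (BB^*y_i, y_i)^{\alpha_i\|A\|^{-2}} \\
&\ge \|A\|^2 \prod_{i=1}^{n} \beta_i^{\,\alpha_{n-i+1}\|A\|^{-2}}.
\end{aligned}$$

The argument is symmetric in $A$ and $B$ and the result follows.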
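THEOREM 2. Let $A$ and $B$ be n.n.h. with eigenvalues $\alpha_1 \le \dots \le \alpha_n$ and $\beta_1 \le \dots \le \beta_n$ respectively. Let $0 \le \theta_1 \le \dots \le \theta_n$ denote the eigenvalues of $A + B$. Then for $r \le k \le n$,

$$E_r(\theta_1, \dots, \theta_k) \ge \max\Big\{ \binom{k}{r}\sum_{s=0}^{r}\prod_{j=1}^{r-s}\beta_j\, E_s(\alpha_1, \dots, \alpha_r),\ \ \binom{k}{r}\sum_{s=0}^{r}\prod_{j=1}^{r-s}\alpha_j\, E_s(\beta_1, \dots, \beta_r) \Big\},$$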


* 
pitta <i )E (LE an) han) 
(EOL Ean) Can} 





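Proof. Let $x_1, \dots, x_k$ be an o.n. set of eigenvectors of $C = A + B$ corresponding respectively to $\theta_1, \dots, \theta_k$. Let $a_j = (Ax_j, x_j)$ and $b_j = (Bx_j, x_j)$. Then

$$\begin{aligned}
E_r(\theta_1, \dots, \theta_k) &= E_r(a_1 + b_1, \dots, a_k + b_k) \\
&= \sum_{\omega \in Q_{kr}} \sum_{s=0}^{r} \ \sum_{\mu = (i_1 < \dots < i_s) \subset \omega} \ \prod_{j \in \mu} a_j \prod_{j \in \omega \setminus \mu} b_j \\
&\ge \sum_{\omega \in Q_{kr}} \sum_{s=0}^{r} \prod_{j=1}^{r-s} \beta_j\, E_s\big((a_i)_{i \in \omega}\big) \\
&\ge \sum_{\omega \in Q_{kr}} \sum_{s=0}^{r} \prod_{j=1}^{r-s} \beta_j\, E_s(\alpha_1, \dots, \alpha_r) \\
&= \binom{k}{r} \sum_{s=0}^{r} \prod_{j=1}^{r-s} \beta_j\, E_s(\alpha_1, \dots, \alpha_r).
\end{aligned}$$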


s=0 jel 






















The result is symmetric in A and B and the first inequality follows. The 
second inequality is proved analogously. 


REFERENCES


1. Ky Fan, On a theorem of Weyl concerning eigenvalues of linear transformations I, Proc. N.A.S. (U.S.A.), 35 (1949), 652-655.

2. H. Flanders, A note on the Sylvester-Franke theorem, Amer. Math. Monthly 60 (1954), 8, 543.

3. G. H. Hardy, J. E. Littlewood, G. Pólya, Inequalities (Cambridge, 2nd ed., 1952).

4. A. Horn, On the singular values of a product of completely continuous operators, Proc. N.A.S. (U.S.A.), 36 (1950), 374.

5. A. Ostrowski, Sur quelques applications des fonctions convexes et concaves au sens de I. Schur, J. Math. Pures Appl. (9) 31 (1952), 253-292.

6. J. H. M. Wedderburn, Lectures on matrices, Amer. Math. Soc. Colloq. Pub., 17 (1934).


United States Naval Ordnance Test Station, Pasadena, California
and
California Institute of Technology.

University of British Columbia.











NON-DESARGUESIAN 
PROJECTIVE PLANE GEOMETRIES WHICH SATISFY 
THE HARMONIC POINT AXIOM 


N. S. MENDELSOHN 


1. Introduction and summary. In her papers (12) and (13) R. Moufang discusses projective plane geometries which satisfy the axiom of the uniqueness of the fourth harmonic point. Her main result is that in such geometries non-homogeneous co-ordinates may be assigned to the points of the plane (except for the “line at infinity”) in such a way that straight lines have equations of the forms $\alpha x + y + \beta = 0$, or $x + \gamma = 0$. It is shown that the co-ordinates form an alternative field and furthermore, given an alternative field, it is shown that a non-Desarguesian geometry can be constructed for which the axiom of the uniqueness of the fourth harmonic point is satisfied.

In the present paper the author attacks the same problem from a radically different point of view, one which seems to him more natural. The theory of nets is completely avoided. No use is made of algebraic arguments since it turns out that all the rules of operation of the algebraic symbols have a natural geometric interpretation.

The basic idea of the argument used is the following. Projective collineations 
are defined for non-Desarguesian geometries and a projective collineation 
group as well as a unimodular subgroup are obtained. The axiom of the fourth 
harmonic point immediately leads to the result that the harmonic relation for 
four points in line is invariant under projection. This in turn leads to the facts 
that in such geometries a “‘full’’ unimodular projective collineation group 
exists and that any projective transformation between two lines can be ex- 
tended to a unimodular projective collineation of the whole plane. With these 
results as background the construction of co-ordinate systems in a line and 
in the plane are readily carried out and the usual rules of operations on the 
symbols follow quite smoothly. 

In this development the reason for the failure of the general associative law $a(bc) = (ab)c$ as contrasted with the validity of the special associative laws $(aa)b = a(ab)$ and $a^{-1}(ab) = b$ is immediately apparent. In fact it is known that in a classical Desarguesian geometry the law $a(bc) = (ab)c$ follows directly from the result that in such geometries the standard construction of the point $ab$ from the points $a$ and $b$ leads to its unique determination as soon as points $0, 1, \infty$ are assigned in the line: i.e., the point $ab$ does not depend on the special position of points outside the line which are used in the construction. In contrast a failure of Desargues’ theorem leads immediately to the fact


Received February 16, 1956. 









that in general the position of the point $ab$ in the line depends not only on the assignment of $0$, $1$ and $\infty$ but also on the assignment of two other special points outside the line. However, the axiom of the fourth harmonic point does lead to the fact that the points $a^{-1}$ and $a^2$ are uniquely determined by $a$ once $0, 1, \infty$ are assigned and hence the laws: $a^{-1}(ab) = b$, $a^2 b = a(ab)$.

It is possible to develop a multiplication in the line in which the product $a \circ b$ is uniquely determined by the “scale” $0, 1, \infty$ in the line. The resulting algebra is a Jordan field. Unfortunately when such co-ordinates are extended to the whole plane the equations of straight lines do not take on simple linear forms which can be handled readily.

Finally, the collineation theory developed here can be applied to more general non-Desarguesian geometries. In the last section some of the results obtained are mentioned together with a discussion of possible directions in which the theory may be used and extended.


2. Axioms for plane projective geometries. The general projective 
plane geometry is characterized by the following set of axioms. There are two 
sets of elements called points and lines respectively and one relation called 
“incidence”’ such that for any point and any line the relationship of “‘incidence”’ 
either holds or does not hold. The relationship is subject to the following three 
axioms: 

A(1). Two distinct points are incident on exactly one line; 

A(2). Two distinct lines are incident on exactly one point; 

A(3). There exist four distinct points no three of which are incident on the 
same line. 

In what follows the usual geometric terminology will be used, i.e. points which are incident on the same line will be called collinear, lines which are incident on the same point will be called concurrent and if a point and line are incident the line will be said to pass through the point and the point will be said to be on the line.

From the axioms of incidence alone very little can be proved. The results of importance are:

(1) If one line contains exactly $n + 1$ points so does every other line and the total number of points is $n^2 + n + 1$.

(2) If $n = p^r$, where $p$ is a prime and $r$ a positive integer, there exists exactly one (apart from isomorphism) Desarguesian geometry with $n + 1$ points on each line.

(3) For $n = 2, 3, 4, 5, 7$ the only geometries are the classical ones; for $n = 6$ there is no geometry. In (5) Bruck and Ryser establish the non-existence of a projective plane for a certain class of $n$.

(4) There exist projective geometries for which Desargues’ theorem fails; in particular finite non-Desarguesian geometries exist for $n = 9$.
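For a prime $p$ the Desarguesian geometry of result (2) can be written down with homogeneous co-ordinates over the integers modulo $p$, and axioms A(1) to A(3) and the count in result (1) can then be checked by enumeration. The sketch below is an illustration under that assumption (it covers only $r = 1$, not general prime powers); the function names are hypothetical.

```python
from itertools import product


def normalize(v, p):
    """Scale a nonzero triple mod p so that its first nonzero entry is 1."""
    lead = next(x for x in v if x % p)
    inv = pow(lead, -1, p)
    return tuple((x * inv) % p for x in v)


def projective_plane(p):
    """Points of the plane over GF(p); lines carry the same normalized triples."""
    pts = sorted({normalize(v, p) for v in product(range(p), repeat=3) if any(v)})
    return pts, list(pts)


def incident(point, line, p):
    """A point (x, y, z) is on the line (a, b, c) when ax + by + cz = 0 mod p."""
    return sum(a * b for a, b in zip(point, line)) % p == 0


def check_axioms(p):
    points, lines = projective_plane(p)
    on_line = {l: {q for q in points if incident(q, l, p)} for l in lines}
    # A(1): two distinct points lie on exactly one common line.
    a1 = all(sum(1 for l in lines if x in on_line[l] and y in on_line[l]) == 1
             for i, x in enumerate(points) for y in points[i + 1:])
    # A(2): two distinct lines share exactly one point.
    a2 = all(len(on_line[l] & on_line[m]) == 1
             for i, l in enumerate(lines) for m in lines[i + 1:])
    # A(3): four points, no three of which are collinear.
    quad = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
    a3 = all(sum(1 for q in quad if q in on_line[l]) <= 2 for l in lines)
    sizes = {len(s) for s in on_line.values()}
    return len(points), sizes, a1, a2, a3


fano = check_axioms(2)     # n = 2: 7 points, 3 on each line
order3 = check_axioms(3)   # n = 3: 13 points, 4 on each line
```

The non-Desarguesian planes of result (4) cannot be obtained this way, since every plane co-ordinatized by a field satisfies Desargues' theorem.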

At a later stage a new axiom will be introduced. Also in what follows some of the proofs used will assume that a line has at least five points. This latter assumption will enable us to avoid exceptional cases which in any case can easily be examined.


3. Projective transformations. In this section properties of projective 
transformations are discussed, the purpose being to contrast the properties of 
Desarguesian and non-Desarguesian plane projective geometries and also to 
build up a rudimentary theory of projective collineations in a non-Desarguesian 
plane. This theory will be of fundamental importance in the subsequent 
development. 

Points A, B, C, D, ... in a line m and points A', B', C', D', ... on a line m' are said to be in perspective from a point O if AA', BB', CC', ... all pass through O. If we think of this as a mapping A → A', B → B', C → C', etc., we will use the notation

$$A, B, C, D, \dots \ \overset{O}{\doublebarwedge}\ A', B', C', D', \dots$$

to describe the mapping. A transformation A → A', B → B', C → C', etc., from the points of a line m to those of a line m' will be called a projective transformation if the points A', B', C', ... are obtained from A, B, C, ... as a result of a finite sequence of perspectivities. In the classical theory projective transformations are studied with reference to three properties, viz., Desargues’ theorem, Pappus’ or Pascal’s theorem, and the fundamental theorem of projective transformations in a line. Some of the more important properties are listed here:

(a) In any plane projective geometry (Desarguesian or not) there is always 
at least one projective transformation which carries any three collinear points 
into any three other collinear points. 

(b) Desargues’ theorem is equivalent to its converse. 

(c) Desargues’ theorem is valid if and only if the plane is embeddable in a 
projective three-space. 


(d) In any plane projective geometry, Pappus’ theorem implies Desargues’ 
theorem. 


(e) In a finite plane projective geometry Pappus’ theorem is equivalent to 
Desargues’ theorem. 

(f) Pappus’ theorem is equivalent to the fundamental theorem which states 
that there is exactly one projective transformation which maps a set of three 
collinear points onto any set of three collinear points. 

Also to be discussed in this section are some of the consequences of the so-called “little Desargues theorem”, a statement which may or may not be valid in any specific plane projective geometry. Roughly speaking the “little Desargues theorem” states that Desargues’ theorem holds for those pairs of triangles in which the centre of perspectivity is incident on the axis of perspectivity. A formal statement of the “little Desargues theorem” and its converse are given below.




LITTLE DESARGUES THEOREM. Let ABC (Figure 1) and A'B'C' be two triangles such that AA', BB', CC' pass through O. Let AB meet A'B' at C'', BC meet B'C' at A'' and CA meet C'A' at B''. If two of the points A'', B'', C'' are collinear with O then so is the third.


FIG. 1


Converse of the Little Desargues Theorem. Let ABC and A'B'C' be two triangles such that AB meets A'B' at C'', BC meets B'C' at A'' and CA meets C'A' at B'' and let A'', B'', C'' be collinear. If two of the lines AA', BB', CC' intersect at a point on the line A''B''C'' then the third line also passes through this point.


THEOREM 1. In any plane projective geometry the “little Desargues theorem” is equivalent to its converse.


Proof. As both parts of the equivalence are proved by the same means only the statement “the little Desargues theorem implies its converse” is proved here. In Figure 1 using the above notation assume that A'', B'', C'' are collinear and that BB' meets CC' at O on A''B''C''. Let OA meet C'B'' at A*. It is sufficient to show that A* = A'. In the triangles ABC, A*B'C' the lines AA*, BB', CC' are concurrent at O. Furthermore BC meets B'C' at A'', AC meets A*C' at B''. Since O, A'', B'' are collinear the little Desargues theorem implies that A*B' meets AB on the line A''B''. Hence A*B' passes through C'' so that A* is the point of intersection of B'C'' and C'B''. Hence A* = A'.










THEOREM 2. If Desargues’ theorem fails for a pair of triangles then there exists a line and a projective transformation in it, such that three points are fixed by the transformation and such that not all points in the line are fixed by the transformation.








Proof. The theorem actually follows from remarks (d) and (f) made previously but a direct proof is given here because the resulting diagram is used later in other connections. Furthermore, the proof will be given a second interpretation in what follows.

Since Desargues’ theorem fails, its converse also fails. Let PRS and P'R'S' be two triangles for which the converse of Desargues’ theorem fails (Figure 2). Let PR meet P'R' at E, PS meet P'S' at A and RS meet R'S' at O. Let A, O, E be collinear. Suppose PP' and RR' meet at Q. Since the converse of Desargues’ theorem fails Q, S and S' are not collinear. Let QS meet OA at M and QS' meet OA at M'. Then M ≠ M'. Let QP'P meet OA at U and let R'S' and RS meet PP' at T' and T respectively. Then

$$O, M, A, U \ \overset{S}{\doublebarwedge}\ T, Q, P, U \ \overset{R}{\doublebarwedge}\ O, B, E, U \ \overset{R'}{\doublebarwedge}\ T', Q, P', U \ \overset{S'}{\doublebarwedge}\ O, M', A, U,$$

where B is the point in which RR' meets OA. The resultant mapping sends O → O, A → A, U → U, M → M' ≠ M.

Theorem 2 shows that failure of Desargues’ theorem increases the number 
of projective transformations possible in a line. It will be seen subsequently 
that the effect of such failure can work in reverse in the case of transformations 
of the whole plane. 

At this point the notions of collineation and projective collineation are introduced. A point to point mapping of all the points of a plane is said to be a collineation if it is one-one, its inverse exists, if collinear points have as images collinear points and if collinear points are the images of collinear points. In the classical projective plane, where Desargues’ theorem is valid, a projective collineation is defined as follows. The plane Π is embedded in a three dimensional space and a perspectivity from the plane Π to a plane Π', distinct from Π, with centre O not on either plane is defined by mapping the point A in Π onto the point A' in Π' whenever A, A' and O are collinear. A projective collineation of the plane Π is a point-point mapping of the plane Π which is the result of a finite sequence of perspectivities. A classical result is that if the co-ordinate field has automorphisms distinct from the identity there exist collineations which are not projective. If one is to distinguish between projective and non-projective collineations in a non-Desarguesian geometry a different approach is necessary since it is not possible to embed a non-Desarguesian geometry in a projective three-space.

It is possible to define projective collineations in a non-Desarguesian plane by means of the notions of homology and elation and the definition now to be introduced has the property that when applied to Desarguesian geometries it yields the full projective collineation group. Let P be any point and l be any line. Let A and A' be any two points collinear with P but which are distinct from P and are not on l. In a Desarguesian plane there is always exactly one projective collineation which keeps all points on l fixed and all lines through P fixed and which maps A → A'. If P is on l the mapping is called an elation and if P is not on l it is called a homology. Furthermore it is known that the homologies generate all projective collineations and that the elations generate a subgroup of collineations usually referred to as the unimodular subgroup. Before defining the projective collineation group for the non-Desarguesian case a few properties of homologies and elations valid for any projective plane will be developed.


THEOREM 3. Let P be any point and l be any line. Let A and A' be two points not on l but collinear with P and distinct from P. There exists at most one collineation which keeps all points on l fixed, all lines through P fixed and which maps A into A'.


Proof. The proof is independent of whether or not P is on l. In Figure 3 the case P not on l is shown. Let B be any point not on AA' and not on l. It is shown that the image B' of B is uniquely determined. Let AB meet l at M. Since all lines through P are fixed B' is on PB. Furthermore since A → A' and B → B' the line AB has as its image the line A'B'. Also the point M on AB is fixed and hence must lie on A'B'. Hence B' is determined as the point of intersection of PB and A'M. In the same way the fact that B → B' determines a unique image for any point on the line AA'.


THEOREM 4. Using the notation of Theorem 3, a necessary and sufficient condition that the collineation described in Theorem 3 exists is that Desargues’ theorem is valid for every pair of triangles ABC and A'B'C' with centre of perspectivity P and axis of perspectivity l. More exactly, the condition is: if AA', BB', CC' pass through P and AB meets A'B' at C'', AC meets A'C' at B'' and if C'' and B'' are on l then BC meets B'C' at A'' which is on l.











FIG. 3


Proof. Sufficiency: if Desargues’ theorem holds for all such triangles then the image C' of C is the same whether obtained from A → A' or from B → B'. The mapping is thus well defined and is obviously a collineation.

Necessity: suppose the mapping is a collineation. Let ABC and A'B'C' be any two triangles in perspective from P and with B'' and C'' on l. Then in the mapping A → A' implies B → B' and C → C'. Let BC meet l at A''. Since BC maps into B'C' and A'' is fixed it follows that B'C' passes through A''.

In the collineations just described the line of fixed points is called the axis, the point of fixed lines the centre. The notation Elat (P, l; A → A') will be used to denote the elation with centre at P, axis at l and A' the image of A. Similarly the corresponding homology will be described as Hom (P, l; A → A').
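The construction in the proof of Theorem 3 (B' is the intersection of PB with A'M, where M is the point in which AB meets l) and the well-definedness condition of Theorem 4 can be tried out in a small Desarguesian plane. The sketch below assumes the plane over the integers modulo 5 with homogeneous co-ordinates, where joins and meets are both given by a cross product; the particular centre, axis and points, and all function names, are chosen for illustration only.

```python
from itertools import product


def normalize(v, p):
    """Reduce mod p and scale so that the first nonzero entry is 1."""
    v = tuple(x % p for x in v)
    lead = next(x for x in v if x)
    inv = pow(lead, -1, p)
    return tuple((x * inv) % p for x in v)


def cross(u, v, p):
    """Join of two distinct points, or meet of two distinct lines."""
    return normalize((u[1] * v[2] - u[2] * v[1],
                      u[2] * v[0] - u[0] * v[2],
                      u[0] * v[1] - u[1] * v[0]), p)


def on(point, line, p):
    return sum(a * b for a, b in zip(point, line)) % p == 0


def image(B, P, l, A, A1, p):
    """B' = PB meet A'M with M = AB meet l, as in the proof of Theorem 3."""
    M = cross(cross(A, B, p), l, p)
    return cross(cross(P, B, p), cross(A1, M, p), p)


p = 5
points = sorted({normalize(v, p) for v in product(range(p), repeat=3) if any(v)})
P, l = (0, 0, 1), (0, 0, 1)        # centre P; axis l is z = 0, so P is not on l
A, A1 = (1, 0, 1), (2, 0, 1)       # A and A' are collinear with P and off l
line_AA1 = cross(A, A1, p)
movable = [B for B in points if not on(B, l, p) and not on(B, line_AA1, p)]


def well_defined():
    """Does A -> A' give every C the same image as B -> B' does?"""
    for B in movable:
        B1 = image(B, P, l, A, A1, p)
        line_PB = cross(P, B, p)
        for C in movable:
            if on(C, line_PB, p):
                continue
            if image(C, P, l, A, A1, p) != image(C, P, l, B, B1, p):
                return False
    return True


# In this plane the homology is the linear map (x, y, z) -> (2x, 2y, z).
matches_linear_map = all(
    image(B, P, l, A, A1, p) == normalize((2 * B[0], 2 * B[1], B[2]), p)
    for B in movable)
```

Because the plane over a field is Desarguesian, the check succeeds and the homology is unobstructed in the sense defined below; in a non-Desarguesian plane the analogous check can fail for some choices of P, l, A and A'.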

In the projective plane if the collineation which maps A into A’ and which 
keeps all points on / fixed and all lines through P fixed does not exist we will 
say the collineation is obstructed; otherwise the collineation will be said to be 
unobstructed. We define the projective collineation group as the set of all 
collineations of the plane generated by all the unobstructed homologies and 
elations. The subgroup generated by all unobstructed elations will be called 
the unimodular subgroup and any collineation which is representable as a 
product of elations will be termed a unimodular collineation. If no homology 
is obstructed the geometry will be said to admit a full projective collineation 
group, and if no elation is obstructed the geometry will be said to admit a 







full unimodular group. As corollaries to Theorem 4 we have the following two 
theorems. 


THEOREM 5. A necessary and sufficient condition for a full projective collineation 
group to exist is that the geometry be Desarguesian. 


THEOREM 6. A necessary and sufficient condition for a full unimodular 
group to exist is that the little Desargues theorem is valid in the geometry. 


THEOREM 7. If the little Desargues theorem is valid in a projective plane then 
any projective transformation between two lines of the plane can be embedded in 
a unimodular collineation of the plane. 


Proof. It is only necessary to show that any perspectivity between two lines 
can be embedded in an elation, since on representing a projective transforma- 
tion as a product of perspectivities the resultant transformation is embedded 
in the collineation which results from multiplying the corresponding elations. 
Let l and m be two lines intersecting at the point A and suppose the points 
of l are mapped onto those of m by a perspectivity with P as centre. Let B 
on l be mapped into B' on m by this perspectivity. Then Elat (P, PA; B → B') 
embeds the given perspectivity. 

It may be remarked here that Theorem 7 is not necessarily true for non- 
Desarguesian geometries for which the little Desargues' theorem is not valid. 
In fact it is possible to exhibit a non-Desarguesian plane and a projective 
transformation in a line which is not embeddable in any collineation of the 
plane.¹ 

For geometries which satisfy the little Desargues' theorem, it is not generally 
true that there exists a projective collineation which maps any set of four 
points, no three of which are collinear, into any other such set of four points. 
In this case the following weaker theorem is valid. 


THEOREM 8. Let Π be a projective plane for which the little Desargues' theorem 
is valid. Let A, B, C, D and A', B', C', D' be two sets of four distinct points such 
that A, B, C are collinear and D is not on ABC and A', B', C' are collinear and 
D' is not on A'B'C'. There exists a unimodular collineation which maps A → A', 
B → B', C → C' and D → D'. 


Proof. By statement (a) of this section there is at least one projective 
transformation which maps A → A', B → B', C → C'. Embed this transformation 
in a unimodular collineation U. Let U map D → D''. If D'' = D', the 


¹ The geometry described (15, pp. 383-384) by Veblen and Wedderburn has this property. 
Using their notation the points of Figure 2 may be assigned coordinates as follows: P(0, 1, 1), 
S(2 + 2j, 1, 0), RO, j, 1), P(, 2, D, S’G, 7, D, R’(2, 1, 1) QU +7, 2+, 1), BO, 0, 1), 
A(j + 1, 0, 1), O(2 + j, 0, 1), B(1 + 2j, 0, 1), U(2, 0, 1), M(1, 0, 1), M'(2j, 0, 1). The 
projective transformation of Theorem 2 keeps the points (2 + j, 0, 1), (j + 1, 0, 1), (2, 0, 1) fixed 
and maps (1, 0, 1) into (2j, 0, 1). It can be verified that in this geometry the projective mapping 
in the line cannot be extended to a collineation (projective or otherwise) of the plane. 













collineation U has the required property. If D'' ≠ D', since D is not on ABC, 
D'' is not on A'B'C'. Let D'D'' meet A'B'C' at E. Let V be the collineation 
Elat (E, A'B'; D'' → D'). The collineation UV then has the required property. 


4. The axiom of the fourth harmonic point. In the subsequent 
development consideration will be restricted to those geometries which satisfy 
the following axiom which will be termed the axiom of the fourth harmonic 
point. Let A, B, C be any three points in line (Figure 4). Let M be any point 
not on AB. Let CEL be any line through C distinct from AB and not passing 
through M, the point L being on MA and E on MB. Let AE intersect BL at 
R and let MR intersect AB at D. Also, let K be the point of intersection of CL 
and MR. 








Fig. 4 


Points A, B, C, D related by such a diagram will be said to satisfy the 
relation H(A, B; C, D). The axiom of the fourth harmonic point may be 
stated as follows: 


A(4). If H(A, B; C, D) then D is distinct from C and is uniquely determined 
by A, B and C. 


The axiom A(4) implies that D is independent of the choice of M and the 
choice of the line CEL. It is possible to weaken the axiom to the assumption 
that D is independent of the choice of the line CEL for fixed M, but it can 
readily be shown that the weakened axiom is equivalent to the stronger form. 
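
As an illustration only, the construction of Figure 4 can be carried out numerically in the real projective plane, where the fourth harmonic point is known to be unique. The sketch below is not part of the paper: it assumes homogeneous coordinates with exact rational arithmetic, and the helper names `cross`, `join`, `meet`, `normalize` and `harmonic_conjugate` are hypothetical, introduced only for this example. Two different choices of M and of the line CEL are seen to give the same point D.

```python
from fractions import Fraction

def cross(u, v):
    """Cross product of two homogeneous triples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

join = cross  # line through two points
meet = cross  # point common to two lines

def normalize(p):
    """Scale a homogeneous triple so that its last non-zero entry is 1."""
    pivot = next(c for c in reversed(p) if c != 0)
    return tuple(Fraction(c) / pivot for c in p)

def harmonic_conjugate(A, B, C, M, L):
    """Point D with H(A, B; C, D), following the harmonic diagram.

    M is a point off the line AB; L is a point of MA (not M or A),
    so that CL plays the role of the line CEL.
    """
    E = meet(join(C, L), join(M, B))   # E on MB
    R = meet(join(A, E), join(B, L))   # R = AE . BL
    D = meet(join(M, R), join(A, B))   # D = MR . AB
    return normalize(D)

# A, B, C on the x-axis with abscissae 0, 3, 1 (assumed sample data).
A, B, C = (0, 0, 1), (3, 0, 1), (1, 0, 1)
D1 = harmonic_conjugate(A, B, C, M=(0, 1, 1), L=(0, 2, 1))
D2 = harmonic_conjugate(A, B, C, M=(1, 2, 1), L=(2, 4, 1))
print(D1, D2)
```

In a plane that fails A(4) the two calls could disagree; the real plane is used here only because its arithmetic is easy to check.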

Figure 4 will be referred to as the harmonic diagram. The points A, B, C, D 
may be described as follows: A and B are diagonal points of the quadrangle 
MLRE and C and D are the points where the diagonals of MLRE meet the 
line AB. Figure 4 is symmetric with respect to A and B and also with respect 
to C and D. Hence: 


THEOREM 9. H(A, B; C, D) implies H(B, A; C, D) and 
H(A, B; C, D) implies H(A, B; D, C). 
















THEOREM 10. If H(A, B; C, D) and A, B, C, D are in perspective with 
L, E, C, K then H(L, E; C, K). 


Proof. If M is the centre of perspectivity, Figure 4 gives a construction of 
D from A, B, C. In Figure 4, MABR is a quadrangle, L and E are its diagonal 
points and K and C are the points where the diagonals meet the line LE. Hence, 
Figure 4 represents a construction for K such that H(L, E; C, K). 

Because of the symmetry of the harmonic construction between C and D 
a theorem similar to Theorem 10 holds if C is replaced by D. 


THEOREM 11. If A, B, C, D and A', B', C', D' are perspective and H(A, B; 
C, D) then H(A', B'; C', D'). 


Proof. Let O be the centre of perspectivity and let CD' meet OA at A* and 
OB at B*. By Theorem 10, H(A, B; C, D) implies H(A*, B*; C, D') which in 
turn implies H(A', B'; C', D'). As an obvious corollary, it follows that: 


THEOREM 12. If A, B, C, D are four points in line related to A', B', C', D' 
by a projective transformation and H(A, B; C, D) then H(A', B'; C', D'). 


THEOREM 13. H(A, B; C, D) implies H(C, D; A, B). 








Fig. 5 













Proof. The diagram of Figure 4 is extended to Figure 5 as follows: obtain 
the point S as the intersection of AM and DE and the point U as the 
intersection of BL and DE. Then 

\[ (A, B, C, D) \;\stackrel{E}{\doublebarwedge}\; (A, M, L, S) \;\stackrel{R}{\doublebarwedge}\; (E, D, U, S) \;\stackrel{L}{\doublebarwedge}\; (C, D, B, A). \]

Hence H(A, B; C, D) implies H(C, D; B, A) which in turn implies H(C, D; 
A, B). 

Theorem 13 allows us to remove the distinction between the pairs A, B and 
C, D of a harmonic tetrad. Hence it is unnecessary to indicate in the notation 
which points are capped. In what follows, the capped form of the symbol will 
be written simply as H(A, B; C, D). 


THEOREM 14. Let A, B, C, D and A, B', C', D' be two sets of four points on 
distinct lines such that H(A, B; C, D) and H(A, B'; C', D'). Then BB', CC', 
DD' are concurrent. 


Proof. Let BB' meet CC' at K and let KD meet AB' at D''. By Theorem 
11, H(A, B; C, D) implies H(A, B'; C', D''). Also H(A, B'; C', D') and 
H(A, B'; C', D'') imply D' = D'' by axiom A(4). 

THEOREM 15. Let P be a point and l be a line not through P. Let A be any 
point. If A is P or A is on l let A' = A. If A is distinct from P and is not on l 
let AP meet l in M and choose A' so that H(A, A'; P, M). Then the mapping 
A → A' is a collineation. 


Fig. 6 






















Proof (Figure 6). By definition and A(4) every point A has a unique 
image. Let B be any point not on l and not in line with AP. Let BP meet l 
at N. Since H(P, M; A, A') and H(P, N; B, B') the lines MN, AB and A'B' 
are concurrent. Let T be the point of concurrency. If C is on AB, let PC meet 
MN at Q and A'B' at C*. Since 

\[ (P, M, A, A') \;\stackrel{T}{\doublebarwedge}\; (P, Q, C, C^{*}) \]

it follows that H(P, Q; C, C*). Hence C* = C', the image of C in the mapping. 
Hence collinear points map into collinear points. 


This mapping is usually called a harmonic homology and is denoted here 
by Harm (P, l). 


THEOREM 16. A projective plane which satisfies the axiom of the fourth 
harmonic point admits a full unimodular group. 








Fig. 7 


Proof. Let P be any point on a line l (Figure 7). Let A and A' be in line 
with P. It is now shown that Elat (P, l; A → A') exists. Choose N so that 
H(P, N; A, A'). Then by Theorem 15, the product of Harm(N, l) and 
Harm(A', l) is a collineation. Now Harm(N, l) Harm(A', l) maps A → A' 
and keeps all points on l fixed. Let B be any point not on l and not on AA'. 
Let Harm(N, l) map B → B'' and Harm(A', l) map B'' → B'. Let BB'' meet 
l at Q and B''B' meet l at R. Then H(B, B''; N, Q) and H(B', B''; A', R). 
By Theorem 14, BB', A'N, QR are concurrent. Hence BB' passes through P. 
This implies Harm(N, l) Harm(A', l) = Elat(P, l; A → A'). 

In (6, 5.28) Coxeter gives the above construction for the Desarguesian plane. 
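
For the Desarguesian case just mentioned, the construction can be checked in affine coordinates by taking the axis l to be the line at infinity of the real plane. This is an assumption made only for the sketch below, and the names `harm` and `compose` are hypothetical. Under that assumption Harm(N, l) is the point reflection in N, the condition H(P, N; A, A') with P at infinity makes N the midpoint of AA', and the product of the two harmonic homologies is the translation carrying A to A'.

```python
from fractions import Fraction

def harm(centre):
    """Harmonic homology with the given centre and the line at infinity
    as axis: in affine coordinates, the reflection X -> 2*centre - X."""
    cx, cy = centre
    return lambda X: (2 * cx - X[0], 2 * cy - X[1])

def compose(f, g):
    """Apply f first, then g (the left-to-right order used in the text)."""
    return lambda X: g(f(X))

# A and its prescribed image A' (assumed sample data).
A = (Fraction(1), Fraction(2))
A1 = (Fraction(4), Fraction(3))

# H(P, N; A, A') with P at infinity: N is the midpoint of A and A'.
N = ((A[0] + A1[0]) / 2, (A[1] + A1[1]) / 2)

elation = compose(harm(N), harm(A1))

print(elation(A))                            # image of A
print(elation((Fraction(0), Fraction(0))))   # image of the origin
```

Every point is moved by the same vector A' - A, so each join BB' is parallel to AA', which is the affine form of "BB' passes through P".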













As a corollary to Theorems 6 and 16 the following result is obtained. 


THEOREM 17. The little Desargues’ theorem is valid in any projective plane 
which satisfies the axiom of the fourth harmonic point. 


The converse of Theorem 17 is also true but a proof is not given here. A 
proof can readily be obtained from the observation that an elation with Q 
as centre and which maps A — B and B — C has the property that H(Q, B; 
A, C). 


5. The addition of points in a line. In this section it will be shown that 
addition in a line can be defined in such a way that the points of the line, 
except for one point (the point at "infinity") form an abelian group with the 
further property that to each point a in a line there is a point ½a such that 
½a + ½a = a. 








Fig. 8 


Let l be any line and let 0 and ∞ be any two points on this line (Figure 8). 
Let P and Q be any two points in line with ∞ but not on l. Let a and b be any 
two points on l distinct from 0 and ∞. Join 0P, aP, bQ and let E be the point 
of intersection of 0P and bQ. Join ∞E to meet aP at F and join QF to meet 
l at a + b. The point a + b is determined by a, b and the four points 0, ∞, 
P, Q. We refer to these latter four points as a scale and denote it by 
{0, ∞; P, Q}. 


THEOREM 18. The point a + b is independent of the points P, Q used in the 
scale. In other words, addition in the line is completely determined by the points 
0, ∞. 


Proof. In Figure 8, let X be the point of intersection of aP and bQ and T 
the point of intersection of 0P and FQ. Let TX meet EF at N and l at V. 
Let ∞X meet QF at Z and 0P at Y. From the quadrangle QPFE it follows 
that H(∞, X; Y, Z). From 

\[ (\infty, X, Y, Z) \;\stackrel{T}{\doublebarwedge}\; (\infty, V, 0, a+b) \]

it follows that H(∞, V; 0, a + b). From 

\[ (\infty, X, Y, Z) \;\stackrel{T}{\doublebarwedge}\; (\infty, N, E, F) \;\stackrel{X}{\doublebarwedge}\; (\infty, V, b, a) \]

it follows that H(∞, V; b, a). From H(∞, V; b, a) it follows that a, b, ∞ 
uniquely determine V and from H(∞, V; 0, a + b) it follows that ∞, V and 
0 uniquely determine a + b. Hence a + b is uniquely determined by 0, ∞, 
a and b. 
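
The addition construction of Figure 8 can be tried out in the real projective plane, purely as an illustration and under assumptions that are not in the paper: the line l is taken to be the x-axis, ∞ its point at infinity, and points are homogeneous triples with rational entries. The names `cross`, `join`, `meet`, `on_line`, `coord` and `add` are hypothetical helpers. Two different scales on two different lines through ∞ return the same sum, as Theorem 18 asserts.

```python
from fractions import Fraction

def cross(u, v):
    """Cross product of two homogeneous triples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

join = cross  # line through two points
meet = cross  # point common to two lines

INF = (1, 0, 0)  # the point "infinity" of the x-axis

def on_line(x):
    """Point of l (the x-axis) with abscissa x."""
    return (Fraction(x), 0, 1)

def coord(p):
    """Abscissa of a finite point of l."""
    return Fraction(p[0]) / p[2]

def add(a, b, P, Q):
    """a + b from the scale {0, inf; P, Q}; P, Q lie on a line through INF."""
    zero = on_line(0)
    E = meet(join(zero, P), join(on_line(b), Q))   # E = 0P . bQ
    Fpt = meet(join(INF, E), join(on_line(a), P))  # F = infE . aP
    return coord(meet(join(Q, Fpt), join(zero, INF)))

s1 = add(2, 5, P=(1, 1, 1), Q=(4, 1, 1))    # scale on the line y = 1
s2 = add(2, 5, P=(-2, 3, 1), Q=(5, 3, 1))   # scale on the line y = 3
print(s1, s2)
```

In this model the constructed sum is ordinary addition of abscissae, which is why both scales agree.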


THEOREM 19. For all a and b distinct from ∞ and 0, a + b = b + a. 


Proof. a + b is determined by the harmonic relationships H(∞, V; b, a) 
and H(∞, V; 0, a + b). V is unchanged by the interchange of a and b. Hence 
b + a is the same point as a + b. 

The construction for a sum collapses when one of the points a or b is 0. 
We will define addition in this case by a + 0 = a and 0 + b = b, for all a, 
b distinct from ∞. We leave a + ∞ undefined. 


THEOREM 20. To each a distinct from ∞ there is a point −a such that 
a + (−a) = 0. 








Fig. 9 


Proof. If a = 0 choose −a = 0, otherwise choose −a by the relationship 
H(a, −a; 0, ∞) (Figure 9). Since 0, ∞ are diagonal points of the quadrangle 
PQFE it is clear that a + (−a) = 0 from the construction by means of the 
scale {0, ∞; P, Q}. 


THEOREM 21. To each a distinct from ∞ there is a b such that b + b = a. 
We denote b by ½a. 


Proof. If a = 0 choose b = 0. Otherwise choose b by the relationship 
H(0, a; b, ∞). If a' is constructed from the relationship b + b = a' using 
{0, ∞; P, Q} it is clear that H(0, a'; b, ∞). Hence a' = a. The proof also 
implies that b is uniquely determined by a. 


THEOREM 22. For all a, b, c distinct from ∞, a + (b + c) = (a + b) + c. 





Fig. 10 (points marked on the line 0∞: M = a + b = b + a; T = c + b = b + c; U = a + (b + c) = (a + b) + c) 


Proof. By definition the theorem is true if one or more of a, b, c are 0. Hence 
we assume all of a, b, c are distinct from 0. In the diagram only the case where 
a, b, c are distinct from each other is shown although the proof is valid in all 
cases. In Figure 10, let P and Q be any points in line with ∞ but not on 0∞. 
Let 0P meet aQ at X; ∞X meet bP at Y; QY meet 0∞ at M; 0Y meet 
PQ at R; cR meet ∞X at Z; ZP meet 0∞ at T and ZQ meet 0∞ at U. 


Then M = b + a = a + b from the scale {0, ∞; P, Q} 
T = c + b = b + c from the scale {0, ∞; R, P} 
U = c + (a + b) = (a + b) + c from the scale {0, ∞; R, Q} 
U = (b + c) + a = a + (b + c) from the scale {0, ∞; P, Q} 
Hence (a + b) + c = a + (b + c). 


We note particularly that our proof employs only the fact that addition is 
uniquely defined by the points 0, ∞. The breakdown of an analogous property 
for multiplication leads to a weakened associative law. Theorems 18 to 22 
may be summarized as: 


THEOREM 23. Two distinct points 0, ∞ in a line l determine an abelian group 
under addition for all the points of l distinct from ∞. Furthermore, to each a 
on l distinct from ∞ there is a ½a, uniquely determined, such that ½a + ½a = a. 













THEOREM 24. Let l and l' be two lines. If there is a projective transformation 
from l to l' which maps 0 onto 0' and ∞ onto ∞' and if for each a on l the 
corresponding image on l' is denoted by a', then the mapping a → a' is an 
isomorphism of the corresponding additive groups, where 0', ∞' determines the 
addition in l'. 


Proof. By Theorems 7 and 17, the projective transformation between l 
and l' can be embedded in a collineation U. Let {0, ∞; P, Q} be a scale for 
addition in l and let U map 0 → 0', ∞ → ∞', P → P', Q → Q'. Take {0', ∞'; 
P', Q'} as a scale for addition in l'. The collineation U will then map the whole 
construction for a sum a + b in l into the corresponding construction for 
a' + b' in l'. Hence in the original projective transformation from l to l' the 
image of a + b is a' + b'. Hence the mapping a → a' is an isomorphism of 
the additive groups. 


THEOREM 25. The additive group is uniquely determined apart from an 
isomorphism (i.e. is independent of the line used and the two points 0, ∞ chosen 
on it). 


Proof. This follows from the fact that there is always a projective 
transformation which carries two distinct points 0, ∞ into any other two distinct 
points 0', ∞'. 


6. Some restricted Desargues’ theorems. The axiom of the fourth 
harmonic point implies various variants of Desargues’ theorem. It has already 
been shown that the little Desargues' theorem is valid. In this section some 
other cases are given in which Desargues' theorem is shown to be true. 


THEOREM 26. Desargues’ theorem and its converse are true whenever the centre 
and axis of perspectivity are so related that the line joining corresponding vertices 
of both triangles meets the axis of perspectivity at the harmonic conjugate of the 
centre of perspectivity. 


Proof. The theorem is an immediate corollary of Theorems 15 and 4. 


THEOREM 27. Desargues' theorem and its converse are true for any pair of 
triangles TPV and T*P*V* if T*V* meets TV at a point on the line PP*. 


Proof. Assume that the triangles TPV and T*P*V* are in perspective from 
Q and suppose that TV meets T*V* at ∞ (Figure 11), on the line PP*. T*P* 
meets PT at 0. Let QVV* meet 0∞ at R, let QTT* meet 0∞ at b, and let 
P*V* and PV meet 0∞ at a* and a respectively. From the scale {0, ∞; P, Q} 
it follows that R = a + b and from {0, ∞; P*, Q} it follows that R = a* + b. 
Hence a* + b = a + b or a = a*. Conversely, suppose TPV and T*P*V* are 
such that TV meets T*V* at ∞ on the line PP*. Suppose also that T*P* meets 
TP at 0 and PV meets P*V* at a and that 0, a, ∞ are collinear. Let TT* and 
PP* intersect at Q and let QV and QV* meet 0∞ at R and R* respectively. 
Let QT* meet 0∞ at b. Then from the scale {0, ∞; P, Q}, R = a + b and from 
the scale {0, ∞; P*, Q}, R* = a + b. Hence R = R*. 


THEOREM 28. Let ABC and A’B’C’ be two triangles such that C’ is on the 
line AB. Desargues’ theorem and its converse are both true for such a pair of 
triangles. 











Fig. 12 
















Proof. In Figure 12, let A”’, B’’, C” be the points of intersection of BC, 
B’C’; CA, C’A’ and AB, A’B’ respectively. Theorem 27 applied to triangles 
AA'B’; BB’A” is then equivalent to Theorem 28 for triangles ABC and 
A'B'C’. 

It may be remarked here that Theorems 27, 28 and 17 are all actually 
equivalent to the axiom of the fourth harmonic point. The same may be said 
for Theorem 26 after a slight reformulation. In what follows these theorems 
could all be avoided by arguments involving collineations similar to those 
used in §5. However, in many cases these theorems lead to more direct proofs 
and will be used in the next section. 


7. Multiplication in the line. There are several equivalent ways of 
defining multiplication in a Desarguesian plane but these may lead to in- 
equivalent definitions for the non-Desarguesian case. For our purposes the 
following definition is taken. Let l be a line (Figure 13) and let 0, 1, ∞ be any 
three distinct points on it. Let P and Q be two distinct points in line with 
∞ but not on l. The system {0, 1, ∞; P, Q} is said to be a scale for 
multiplication in the line l, products being defined as follows. If a and b are two points 
on l distinct from 0 and ∞ let 1P meet bQ at R, 0R meet aP at S and let QS 
meet the line l at a point c. The point c is called the product ab. Products 
a∞ or ∞a are not defined. The products 0a and a0 are both defined to be 0. 
Also for all a (≠ ∞) it follows immediately that 1a = a1 = a. It is a result of 
Desarguesian geometry that ab is determined by the points 0, 1, ∞ and is 
independent of the points P and Q. This result is no longer valid in the 
non-Desarguesian case as is shown by the following theorem. 





Fig. 13 


THEOREM 29. If the converse of Desargues' theorem fails for two triangles 
P, R, S and P', R', S' there is a line l and six distinct points on it viz. 0, 1, ∞, 
a, b, M, M' such that in the scale {0, 1, ∞; P, Q}, M = ab while in the scale 
{0, 1, ∞; P', Q}, M' = ab. The point Q is the point of intersection of PP' and 
RR'. 

Proof. We refer to Figure 2 whose construction is given in Theorem 2. If 
the points O, E, B, A, M, M', U of Figure 2 be relabelled 0, 1, b, a, M, M', ∞ 
respectively, the point M is actually constructed as the product ab in the 
scale {0, 1, ∞; P, Q} and M' is the product ab in the scale {0, 1, ∞; P', Q}. 

It follows from Theorem 29, and the theorems valid in a Desarguesian 
geometry, that Desargues' theorem is equivalent to the statement that 
multiplication as defined above is uniquely determined by the choice of 
0, 1, ∞ in a line. Hilbert has shown that Desargues' theorem implies associative 
multiplication, and it is also easy to give a direct proof that if 0, 1, ∞ uniquely 
determine multiplication in a line then multiplication is associative (2, p. 79). 
In the subsequent development it is shown that in special cases the axiom of 
the fourth harmonic point determines a product dependent only on 0, 1, ∞ 
and that in these cases the associative law is valid. 
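
By way of contrast with Theorem 29, the product construction of Figure 13 may be run in the real projective plane, which is Desarguesian, so that the result should not depend on P and Q. The sketch below is illustrative only and rests on assumptions not made in the paper: l is the x-axis, ∞ its point at infinity, and arithmetic is exact over the rationals. The helper names `cross`, `join`, `meet`, `on_line`, `coord` and `mul` are hypothetical.

```python
from fractions import Fraction

def cross(u, v):
    """Cross product of two homogeneous triples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

join = cross  # line through two points
meet = cross  # point common to two lines

INF = (1, 0, 0)  # the point "infinity" of the x-axis

def on_line(x):
    """Point of l (the x-axis) with abscissa x."""
    return (Fraction(x), 0, 1)

def coord(p):
    """Abscissa of a finite point of l."""
    return Fraction(p[0]) / p[2]

def mul(a, b, P, Q):
    """Product ab from the scale {0, 1, inf; P, Q}; P, Q on a line through INF."""
    zero, one = on_line(0), on_line(1)
    R = meet(join(one, P), join(on_line(b), Q))   # R = 1P . bQ
    S = meet(join(zero, R), join(on_line(a), P))  # S = 0R . aP
    return coord(meet(join(Q, S), join(zero, INF)))

p1 = mul(3, 5, P=(1, 1, 1), Q=(4, 1, 1))    # scale on the line y = 1
p2 = mul(3, 5, P=(-2, 3, 1), Q=(5, 3, 1))   # scale on the line y = 3
print(p1, p2)
```

Both scales give the ordinary product of the abscissae. In a non-Desarguesian plane the analogous computation could give two different points M and M', which is the content of Theorem 29.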


THEOREM 30. To each point a distinct from 0 and ∞ on a line l there is a 
point a⁻¹ such that aa⁻¹ = a⁻¹a = 1. Furthermore the point a⁻¹ is uniquely 
determined by the choice of 0, 1, ∞ on the line l. 











Fig. 14 


Proof. Let {0, 1, ∞; P, Q} be a scale for multiplication in l (Figure 14). 
Let aP meet 1Q at M, M0 meet 1P at F, QF meet 0∞ at a⁻¹. By construction, 
it follows that aa⁻¹ = 1 in the scale {0, 1, ∞; P, Q}. To show a⁻¹ is independent 
of P and Q the construction is continued as follows: let QF meet aP at Z; 
M0 meet Q∞ at R; Z1 meet MF and Q∞ at S and Y, respectively; let ZR 
meet 0∞ at E. From the quadrangle F, 1, M, Z it follows that H(P, Q; R, Y). 
Also 

\[ (P, Q, R, Y) \;\stackrel{Z}{\doublebarwedge}\; (a, a^{-1}, E, 1) \]

so that H(a, a⁻¹; 1, E). Also 

\[ (1, E, 0, \infty) \;\stackrel{R}{\doublebarwedge}\; (1, Z, S, Y) \;\stackrel{P}{\doublebarwedge}\; (F, M, S, R) \;\stackrel{Z}{\doublebarwedge}\; (a^{-1}, a, 1, E). \]

Hence, H(1, E; 0, ∞). From the relation H(1, E; 0, ∞) it follows that E is 
uniquely determined by 0, 1, ∞ and from H(a⁻¹, a; 1, E) it follows that a⁻¹ 
is uniquely determined by a, 1, E. Hence a⁻¹ is determined uniquely by 
0, 1, ∞, a. Furthermore since the harmonic relationships which determine 
a⁻¹ from a are symmetric with respect to a and a⁻¹ it follows that a⁻¹a = 1. 

It may be noted that the point E is identical with the point −1 and that 
1⁻¹ = 1, and (−1)⁻¹ = −1. 

A simpler proof of Theorem 30 could be given using a restricted Desargues' 
theorem but such a proof would not exhibit the harmonic relationships 
connecting a and a⁻¹. 


THEOREM 31. For all a distinct from 0 and ∞ and for all b distinct from ∞, 
a⁻¹(ab) = b and (ba)a⁻¹ = b. 



















Proof. The theorem is obvious if b = 0. Assume b ≠ 0. Only the relationship 
a⁻¹(ab) = b is proved here as the proof of the other case is along the same 
lines. In Figure 15, let {0, 1, ∞; P, Q} be a scale for multiplication. Let 1P 
meet bQ at W, 0W meet aP at V and QV meet 0∞ at ab. Let QV meet 1P at T. 
Let aQ meet 1P at R and 0R meet 1Q at S and PS meet 0∞ at a⁻¹. Let a⁻¹W 
meet PQ at P*. Let U be the intersection of bQ and a⁻¹P. The theorem will be 
proved provided it can be shown that 0, U, T are collinear. It is first shown 
that 1, V, P* are collinear. This follows from the equation aa⁻¹ = 1, using the 
scale {0, 1, ∞; P, P*} and Theorem 30. Consider now the triangles Pa⁻¹1 and 
QWV. PQ, a⁻¹W and 1V all pass through P*. Furthermore W is on the line P1. 
By the restricted Desargues' theorem 28 it follows that 0, U, T are collinear. 
Hence from the scale {0, 1, ∞; P, Q}, a⁻¹(ab) = b. 


THEOREM 32. For all points a distinct from ∞ on the line l, a² is determined 
uniquely by the points 0, 1, ∞. 


Proof. Let {0, 1, ∞; P, Q} be a scale for multiplication in the line. The 
theorem is proved in three stages: (1) a² is independent of the position of P on 
∞Q; (2) a² is independent of the position of Q on ∞P; and (3) a² is independent 
of which line through ∞ is used. 


Fig. 16 


Proof of (1). In Figure 16, let {0, 1, ∞; P, Q} be a scale and let P* be any 
other point on ∞PQ. Let 1P meet aQ at R, 0R meet aP at W, QW meet 
0∞ at a². Let 1P* meet aQ at R* and Qa² meet aP* at W*. To show a² in 
scale {0, 1, ∞; P*, Q} is the same point as a² in scale {0, 1, ∞; P, Q} it is 
sufficient to show that 0, R*, W* are collinear. In triangles PWR; P*W*R*, 
PP*, WW*, RR* all pass through Q. Furthermore PW meets P*W* at a 
which is on RR*. By Theorem 27, 0, R*, W* are collinear. 


Proof of (2). In Figure 17, let {0, 1, ∞; P, Q} be a scale and let Q* be any 
other point on ∞PQ. Let 1P meet aQ at R, 0R meet aP at W, QW meet 0∞ 
at a². Let aQ* meet 1P at R*, 0R* meet aP at W*. In triangles QWR; 
Q*W*R*, the lines QQ*, WW*, and RR* are concurrent at P. Furthermore QR 
meets Q*R* at a on WW*. Hence by Theorem 27, Q*, W*, and a² are collinear, 
so that a² is the same for the scales {0, 1, ∞; P, Q} and {0, 1, ∞; P, Q*}. 


Fig. 17 










Proof of (3). In Figure 18, let {0, 1, ∞; P, Q} be a scale and let m be any 
line through ∞ distinct from 0, 1, ∞ and from ∞, P, Q. Let 1P meet m at P* 
and aQ meet m at Q*. Let 1P meet aQ at R, 0R meet aP at W and QW meet 
0∞ at a². Let aP* meet 0R at W*. In triangles WPQ; W*P*Q*, WW*, PP*, 
QQ* are concurrent at R. Also PW meets P*W* at a on Q*Q. Hence by 
Theorem 27, Q*, W*, a² are collinear. Hence a² is the same for the scales 
{0, 1, ∞; P, Q} and {0, 1, ∞; P*, Q*}. 

It is clear that one can go from scale {0, 1, ∞; P, Q} to scale {0, 1, ∞; A, B} 
by a series of transformations of types (1), (2), (3). 










Remarks on Theorem 32. An alternative proof of Theorem 32 which does 
not employ the restricted Desargues' theorem can be given. This proof is 
dependent on the fact that a² is determined from a and −a by the relationship 
H(a, −a; 1, a²). Furthermore the uniqueness of a⁻¹ and of a² are not independent 
facts algebraically. In fact, the equation 

[displayed equation not legible in the source] 

yields an independent algebraic proof of the uniqueness of a². In a problem 
in the American Mathematical Monthly (11) the author has shown how to 
express a²b (or aba in the case of non-commutative multiplication) in terms of 
a and b using only addition, subtraction and reciprocation. The equivalent 
of this identity was used by Hua in (9) to obtain properties of division rings. 


THEOREM 33. For all a, b distinct from ∞, a²b = a(ab) and ba² = (ba)a. 


Fig. 19 
Proof. In Figure 19, let {0, 1, ∞; P, Q} be a scale and let 1P meet bQ 
at Y, 0Y meet aP at Z*, QZ* meet 0∞ at ab. Let 1P meet abQ at L*, 0L* 
meet aP at L, QL meet 0∞ at a(ab). Let aL* meet PQ at N. The point a² 
is constructed from {0, 1, ∞; P, N} by joining NL to meet 0∞ at a². Let 
a²P meet QL at Z. The equation a²b = a(ab) is valid provided 0, Z*, Z are 
collinear. To show this, consider triangles ZLa² and Z*L*a. ZL meets Z*L* 
at Q, Za² meets Z*a at P and La² meets L*a at N. Furthermore, P, N and Q 
are collinear and L is on the line Z*a. By Theorem 28, Z, Z* and 0 are collinear. 


THEOREM 34. For all a, b, c distinct from ∞, a(b + c) = ab + ac and 
(b + c)a = ba + ca. 








Fig. 20 (point marked on the line 0∞: ba + ca = (b + c)a) 


Proof. Both statements are proved along the same lines so only the second 
is proved here. The theorem is obvious if any one of a, b, or c is 0. Hence 
assume a, b, c are all distinct from 0. In Figure 20, let {0, 1, ∞; P, Q} be a scale 
for multiplication in the line 0, 1, ∞. Let 1P meet aQ at A, 0A meet bP at X, 
QX meet 0∞ at ba. Let cP meet 0A at Y, YQ meet 0∞ at ca. Let 0A meet 
PQ at C; bC meet ∞Y at E; ba, C meet ∞Y at D; DQ meet 0∞ at ba + ca. 
Let EP meet 0∞ at b + c; (b + c), P meet 0A at F; FQ meet 0∞ at 
(b + c)a. All points on the line 0∞ have been properly labelled with respect 
to addition and multiplication. The equation (b + c)a = ba + ca will be valid 
if F is on the line QD. In triangles PQX; EDC, PQ meets ED at ∞, PX meets 
EC at b, and QX meets DC at ba. Furthermore ∞, b and ba are collinear and 
C is on PQ. By Theorem 28, F, Q, D are collinear. 

The main results of this section and the last may be summarized in the 
statement: 


THEOREM 35. Under addition and multiplication the points on a line which 
are distinct from ∞ form an alternative division ring, for any fixed scale 
{0, 1, ∞; P, Q}. 


In her paper (12) Moufang has shown that starting with an alternative 
division ring of characteristic distinct from 2 one can set up a non-Desarguesian 
plane projective geometry which satisfies the axiom of the fourth harmonic 
point. 

In the next few theorems some properties of multiplication are developed 
which are interesting in themselves but which are not needed for the main 
development. 


THEOREM 36. Let {0, 1, ∞; P, Q} be a scale for both addition and 
multiplication in a line l (i.e. if {0, 1, ∞; P, Q} is a multiplicative scale the set {0, ∞; 
P, Q} is taken for the additive scale). Let m be any other line in the plane, and 
let 0', 1', ∞' be three arbitrary points of m. There is a scale for addition and 
multiplication in m such that the algebra of points in l is isomorphic to that of 
the points in m. 


Proof. Take any projective transformation which maps 0 → 0', 1 → 1', 
∞ → ∞' and embed this in a unimodular collineation of the plane. Let P' 
and Q' be the images of P and Q in this collineation and take {0', 1', ∞'; P', Q'} 
as a scale for the line m. Let the image of a on l be a' on m. The mapping 
a → a' is the required isomorphism. This follows from the fact that any 
construction for a sum or product using the scale {0, 1, ∞; P, Q} is mapped 
by the collineation into the corresponding construction using the scale 
{0', 1', ∞'; P', Q'}. 

Since multiplication in the line as developed here is not uniquely determined 
by the scale points in the line, a natural question which one may ask is: can 
multiplication be defined in such a way that products are uniquely determined 
by the scale points 0, 1, ∞ in the line? An affirmative answer is given in the 
next theorem. 


THEOREM 37. The point a ∘ b defined as a ∘ b = ½(ab + ba) is uniquely 
determined by the points 0, 1, ∞ in the scale, and the points of the line distinct 
from ∞ form a Jordan field under the operations of + and ∘. 


Proof. We have 

\[ a \circ b = \left(\frac{a+b}{2}\right)^{2} - \left(\frac{a-b}{2}\right)^{2} \]

and the uniqueness follows from Theorems 18, 20, 21, and 32. The Jordan 
associative law a² ∘ (b ∘ a) = (a² ∘ b) ∘ a follows from a direct computation 
using the definition of a ∘ b. 
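
The direct computation can be illustrated in a familiar non-commutative division ring. The sketch below is an assumption-laden example, not taken from the paper: it uses the quaternions with rational coefficients (which are associative, hence in particular alternative), and the class `Quat` and function `jordan` are hypothetical names. It checks that ½(ab + ba) equals the difference of the two squares, and that the Jordan associative law holds for one sample pair.

```python
from fractions import Fraction

class Quat:
    """Quaternion w + xi + yj + zk with rational coefficients."""

    def __init__(self, w, x, y, z):
        self.c = tuple(Fraction(t) for t in (w, x, y, z))

    def __add__(self, other):
        return Quat(*(s + t for s, t in zip(self.c, other.c)))

    def __sub__(self, other):
        return Quat(*(s - t for s, t in zip(self.c, other.c)))

    def __mul__(self, other):
        a1, b1, c1, d1 = self.c
        a2, b2, c2, d2 = other.c
        return Quat(a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
                    a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
                    a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
                    a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

    def half(self):
        return Quat(*(t / 2 for t in self.c))

    def __eq__(self, other):
        return self.c == other.c

    def __repr__(self):
        return "Quat" + repr(tuple(str(t) for t in self.c))

def jordan(a, b):
    """Jordan product: one half of (ab + ba)."""
    return (a * b + b * a).half()

a = Quat(1, 2, 0, -1)   # sample elements, chosen so that ab != ba
b = Quat(0, 1, 3, 2)

h_sum, h_diff = (a + b).half(), (a - b).half()
via_squares = h_sum * h_sum - h_diff * h_diff

print(jordan(a, b), via_squares)
print(jordan(a * a, jordan(b, a)) == jordan(jordan(a * a, b), a))
```

Only sums, differences, halves and squares enter `via_squares`, which mirrors why the uniqueness of a ∘ b reduces to the theorems on addition and on a².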

Geometrically the relationship between ab and a ∘ b is given by the harmonic 
relationship H(ab, ba; a ∘ b, ∞). 

In spite of the intrinsic nature of Jordan multiplication, i.e., its independence 
of the position of the points P and Q, it is not of much use since when co- 
ordinates are introduced into the whole plane it does not lead to linear 
expressions for the equations of straight lines. 

THEOREM 38. If ab = ba the point ab is uniquely determined by 0, 1, ∞. 
Furthermore if ab = ba for all a and b in any line the same is true for any other 
line and the geometry is Desarguesian. 






















Proof. If ab = ba then ab = a ∘ b, so that by Theorem 37, ab is uniquely 
determined by 0, 1, ∞. By Theorem 36, if ab = ba for all a, b in any line the 
same relationship holds for any other line. By a previous remark the uniqueness 
of ab for all a and b is equivalent to Desargues' theorem. As an immediate 
corollary the following theorem is true. 


THEOREM 39. Any alternative division ring of characteristic different from 2 
in which multiplication is commutative is a field. 


Remark. The interest here is that the proof is almost entirely geometric. 
The only algebraic relationship used was 

\[ \left(\frac{a+b}{2}\right)^{2} - \left(\frac{a-b}{2}\right)^{2} = \frac{ab+ba}{2} \]

and even this may be dispensed with. It is the author's belief that a completely 
geometric proof of the fundamental theorem of alternative division rings, 
namely, that every non-associative alternative division ring is a Cayley 
division algebra over its centre, is not an unreasonable expectation. 


8. Co-ordinates in the plane. At this point we introduce non-homo- 
geneous co-ordinates to all the points of the plane except those on one line— 
the line at infinity. The procedure is straightforward except that some care 
must be exercised in order to arrange that addition and multiplication in the 
various lines are consistent. The reason for this, of course, is that the points 
0, 1, ∞ are not sufficient to determine uniquely the multiplication in a line. 
However, Theorem 36 will be used as a basis for connecting the algebras of 
the various lines. 

Let O, U, V, L (Figure 21) be four points in the plane no three of which 
are collinear. Let UL meet OV at T, VL meet OU at M, OL meet UV at W 
and MT meet UV at P. We are not interested in assigning co-ordinates to 
the points on the line UV although we will label some of these points. To 
every other point, a pair of numbers (x, y) will be assigned. First, co-ordinates 
are assigned to O, M, U, W, V, T, L as follows: O(0, 0); M(1, 0); U(∞, 0); 
W(∞, ∞); V(0, ∞); T(0, 1); L(1, 1). Second, every point on the line O, L, 
W will be assigned co-ordinates of the form (a, a) and our rules of addition 
and multiplication will apply to the first co-ordinate. On O, L, W take the 
scale {(0, 0), (1, 1), (∞, ∞); U, V} to determine addition and multiplication 
of the first co-ordinates. Third, every point on the line O, M, U will be assigned 
a co-ordinate (k, 0), the point (k, 0) being defined as the intersection of the 
line joining V to (k, k) and the line OU. In the same way points on the line 
OV will be assigned co-ordinates (0, m) where (0, m) is the intersection of 
the lines OV and U(m, m). Let X be any other point of the plane and let 
UX meet OV at the point (0, y) and VX meet the line OU at the point (x, 0). 
Assign to X the co-ordinates (x, y). 








[Fig. 21: labelled points include P, O(0, 0) and M(1, 0).]


The transformation Elat {V, OV; L → M} maps the point (b, b) on OW
into (b, 0) on OU. Furthermore it is easily seen that this elation maps U → P
and V → V. Hence by Theorem 36, using the scale {(0, 0), (1, 0), (∞, 0);
P, V}, addition and multiplication in OU are consistent with addition and
multiplication in OW, where addition and multiplication in OU are applied to
the first co-ordinate. In the same way Elat {U, OU; L → T} maps (b, b) on
OW into (0, b) on OV. Also V → P and U → U. Hence, using the scale
{(0, 0), (0, 1), (0, ∞); U, P}, addition and multiplication in OV are consistent
with those in OW.


THEOREM 40. Every line through (0, 0) has an equation of the form
x − yβ = 0.


Proof. The lines OV and OU (using the notation of Figure 21) have
equations x = 0, and y = 0 respectively. The line OW has equation x − y = 0.
















Let OY be any other line through O (Figure 22). Let OY meet UL at the point
(β, 1) and let (x, y) be any other point on OY. Let U, (x, y) meet OW at
(y, y); V, (x, y) meet OW at (x, x); V, (β, 1) meet OW at (β, β). From the
scale of multiplication {(0, 0), (1, 1), (∞, ∞); U, V} in the line OW and
using just the first co-ordinates, Figure 22 shows that the product yβ is the
point x. Hence x − yβ = 0.


THEOREM 41. Any line which does not pass through (0, 0) and which is
distinct from UV has an equation of the form x − yβ − α = 0 or y − γ = 0.


[Fig. 23: labelled points include V, O(0, 0), (α, 0), (x, 0) and U(∞, 0).]


Proof. In Figure 23, using the same notation as in Figure 21, if the line
passes through U and meets OV at (0, γ) its equation is y − γ = 0. If it
passes through V and meets OU at (α, 0) its equation is x − α = 0. Now
suppose the line passes through (α, 0) and let it meet UV at Y. Join OY.
By Theorem 40, OY has an equation x − yβ = 0. Let (x, y) be any point
on the given line. Let U, (x, y) meet OY at (yβ, y) and let V, (yβ, y) meet
OU at (yβ, 0). In OU addition is defined for the first co-ordinates and using
the scale {(0, 0), (∞, 0); Y, V} Figure 23 represents a construction of
x = α + yβ. Hence x − yβ − α = 0.

It has now been shown that every line in the plane distinct from UV has
an equation of one of the forms x − yβ − α = 0, y − γ = 0.


9. Concluding observations. It has been the point of view of this paper 
to relate the theorem of Desargues to the notion of a projective collineation, 
the basic theorems being 3, 4, 5, 6, 7, 8 where failure of Desargues’ theorem is 
related to non-existence of certain collineations. In the case of geometries 
satisfying the axiom of the fourth harmonic point the success of the method 
is due mainly to the fact that a full unimodular group exists. 










For more general non-Desarguesian geometries this approach can be used 
to give more or less precise information concerning the way in which Desargues’ 
theorem breaks down. This information could be conceivably used to classify 
non-Desarguesian geometries. We illustrate what is intended by an example. 

Let F be a finite “near-field”, i.e. a finite set of elements which satisfy all
the axioms of a field except the right distributive law and the commutative
law of multiplication. Such near-fields exist and their complete determination
has been carried out by Zassenhaus in (16). Following Veblen and Wedderburn
(15) a projective plane geometry can be constructed from F as follows.
A point is any one of the following three types of triplets: (1, 0, 0), (a, 1, 0)
or (b, c, 1) where a, b, c are in F. Actually any triplet (a, b, c) (except (0, 0, 0))
may be regarded as a point provided one identifies (a, b, c) with (pa, pb, pc)
where p ≠ 0. Note however that (a, b, c) is not identified with (ap, bp, cp).
A line is defined as any set of points satisfying an equation of one of the forms*
x + ya + zb = 0, y + zc = 0, z = 0.

Such points and lines do form a plane projective geometry. It is easily 
verified that for such a geometry the following transformations are 
collineations: 


px’ = φ(x) + φ(y)a + φ(z)b,
py’ = φ(y)c,
pz’ = φ(z)d,
and
px’ = φ(x) + φ(y)a + φ(z)b,
py’ = φ(z)c,
pz’ = φ(y)d,


where a and b are arbitrary elements in F; p, c, d are any elements of F
distinct from 0, and the mapping k → φ(k) is an automorphism of F.
Among such collineations the transformation


px’ = x + yA + zB,
py’ = y,
pz’ = z,


*It is important to note that the equation xa + yb + zc = 0 is not in general the equation
of a line. The footnote on page 383 of the Veblen-Wedderburn paper (15) erroneously assumed
that the equation 

x1li+sy +91 +7 +2= 


represented a line. The statement to which their footnote referred is nevertheless correct. 



















for varying A and B in F represents all elations with centre (1, 0, 0) and 
arbitrary axis through (1, 0, 0). In other words no elation with centre (1, 0, 0) 
is obstructed. By Theorem 4 this can be translated into a statement concerning 
Desargues’ theorem as follows: 


THEOREM 42. In any Veblen-Wedderburn geometry based on a near-field
the little Desargues’ theorem holds for all pairs of triangles in perspective from 
the point (1, 0, 0). 


The author has shown (in some work not yet published) that for all such 
geometries based on near-fields whether finite or infinite the point (1, 0,0) is 
the only point with this property. Hence for such geometries every collineation 
keeps fixed the point (1, 0, 0). A question of interest is whether there are any 
other non-Desarguesian geometries with a point so specialized. 

Another direction in which the investigation may be carried out is suggested 
by the following consideration. In Desarguesian geometry it is a fundamental 
property that for any two sets of four points no three of which are collinear 
there is a projective collineation which carries the first set into the second. 
It follows that if two systems of co-ordinates are set up based on these two 
tetrads the co-ordinates of any point in the plane based on the first tetrad can 
be expressed in terms of the co-ordinates based on the second tetrad by making 
use of the equations of the collineation. In the case where the Desarguesian 
property is weakened to the axiom of the fourth harmonic point this situation 
is no longer valid. Instead the most we can call on is Theorem 8. Nevertheless, 
given four points no three of which are collinear co-ordinates in the plane can 
be set up. If for two such tetrads it happens that a projective collineation 
exists mapping the first set onto the second it would be possible to relate both 
systems of co-ordinates once the equations of the collineation were determined. 
On the other hand if the two tetrads are not connected by a projective 
collineation the most that can be said is that the co-ordinates come from the 
same alternative field. There would be no way in which the co-ordinates with 
respect to one tetrad could be related to the co-ordinates of the second. We 
could then define two tetrads as conjugate if they are joined by a projective 
collineation. It is the author’s conviction that a study of the manner in which 
the geometry breaks down into conjugate classes of tetrads would lead to a 
geometric proof of the fundamental theorem of alternative division rings. 
At present the results are too meagre to give any real information. Of course
this notion of conjugate tetrad could be applied to any non-Desarguesian
geometry. However, in the general case co-ordinates chosen from two distinct 
non-conjugate tetrads need not even belong to isomorphic algebraic systems. 

Pickert (14) has published a book containing most of the known results 
concerning non-Desarguesian planes. 













REFERENCES 


1. R. Baer, Homogeneity of projective planes, Amer. J. Math. 64 (1942), 137-152.
2. H. F. Baker, Principles of Geometry, vol. i (Cambridge, 1929).
3. R. H. Bruck et al., Contributions to geometry, Amer. Math. Monthly, 62 (1955), no. 7,
part II.
4. R. H. Bruck and E. Kleinfeld, Structure of alternative division rings, Proc. Amer. Math. Soc. 2
(1951), 878-890.
5. R. H. Bruck and H. J. Ryser, The non-existence of certain finite projective planes, Can. J. Math. 1
(1949), 88-93.
6. H. S. M. Coxeter, The Real Projective Plane (New York, 1949).
7. M. Hall, Uniqueness of the projective plane with 57 points, Proc. Amer. Math. Soc. 4 (1953),
912-916. Correction, 6 (1955).
8. M. Hall, Projective planes, Trans. Amer. Math. Soc. 54 (1943), 229-277.
9. L. K. Hua, Some properties of a sfield, Proc. Nat. Acad. Sci. 36 (1949), 533-537.
10. N. S. Mendelsohn, A group theoretic characterization of the general projective collineation
group, Trans. Roy. Soc. Can. 40 (1946), Section III, 37-58.
11. N. S. Mendelsohn, Solution to Problem 4062, Amer. Math. Monthly, 61 (1944), 171.
12. R. Moufang, Alternativkörper und der Satz vom vollständigen Vierseit (D9), Abh. Math.
Sem., Hamburg, 9 (1932), 207-222.
13. R. Moufang, Die Schnittpunktsätze des projektiven speziellen Fünfecksnetzes in ihrer Abhängigkeit
voneinander (das A-Netz), Math. Annalen 106 (1932), 755-795.
14. G. Pickert, Projektive Ebenen (Berlin, 1955).
15. O. Veblen and J. H. M. Wedderburn, Non-Desarguesian and non-Pascalian Geometries,
Trans. Amer. Math. Soc. 8 (1907), 379-388.
16. H. Zassenhaus, Ueber endliche Fastkörper, Abh. Math. Sem., Hamburg, 11 (1936), 187-220.


University of Manitoba 













DOUBLE TRANSITIVITY IN FINITE 
PROJECTIVE PLANES 


T. G. OSTROM


Introduction. A projective plane is characterized to a certain extent 
by the amount of transitivity it possesses. This amounts essentially to saying 
that the plane is characterized by its group of collineations. The transitive
planes that have been most thoroughly studied are the cyclic planes (8; 10;
12; 13; 14).¹ It is believed that all finite cyclic planes are Desarguesian, but
it has not been proved that this is the case. Infinite cyclic non-Desarguesian 
planes are known to exist (10). Closely allied are cyclic affine planes of 
Hoffman (11). Zappa (17) has extended some of the notions of cyclic planes 
to the case where the plane is transitive under a group of collineations which 
is not necessarily cyclic. He arrives at a representation of the plane essentially 
the same as Bruck’s group difference sets (6). Baer (3) has used a limited kind 
of transitivity to coordinate planes. Bruck and Kleinfeld have shown that 
alternative planes are transitive on quadrangles (7). 

In this paper, we are limited to finite planes with n + 1 points on a line,
where n is odd and not a square. We show that such planes, if doubly transitive,
are Desarguesian. If they are doubly transitive on points not on a special
line, they are Veblen-Wedderburn planes which cannot be coordinatized by a
near-field.


1. Definitions and preliminary lemmas 


DEFINITION: A plane π will be said to be doubly transitive if, for every
two ordered pairs of points A, B and C, D (where A ≠ B and C ≠ D) there is
a collineation of π such that C is the image of A and D is the image of B.


LEMMA 1. If a collineation σ leaves fixed every point on a line l and every line
through a point P, then no point not on l (with the possible exception of P) is
fixed unless σ is the identity.


Proof. Suppose that Q ∉ l is fixed by σ, where Q ≠ P. Since every line
through Q and a point of l is fixed, every line through Q is fixed. Then every
point in the plane is fixed, since it must lie on a line through Q and a line 
through P. Points on the line PQ are exceptions, but they must be fixed if 
every other point of the plane is fixed. 


Received November 15, 1955 

1There is a misstatement in (14). On page 422, the remark immediately preceding Theorem 
2.5 should read “... if two ovals have more than half of their points in common... .” We
should also remark that the term “perspectivity” as used in that paper is not intended to 
indicate a collineation. 




Remark. It is fairly well known that if a collineation fixes every point on a 
line, then it fixes every line through some point. 


Definition: A collineation (not equal to the identity) which leaves fixed
every point on a line l and every line through a point P will be called a
perspectivity. The line l will be called the axis of the perspectivity and P will
be called the center.


Definition: An involution is a collineation of order two. 


LEMMA 2. If n is odd and not a square, every involution is a perspectivity in
which the center does not lie on the axis.


Proof. A collineation of order two leaves fixed the point of intersection
of each line with its image. Likewise, a line through a point and its image
will be fixed. The intersection of two fixed lines is a fixed point. Thus every
fixed line contains a fixed point and we have already shown that every other
line contains a fixed point. Our collineation σ is what Baer (4) calls a
quasi-perspectivity. He has shown that every quasi-perspectivity is either a
perspectivity or leaves fixed a subplane of order √n. Since n is not a square, σ is a
perspectivity. If the center P lies on the axis, consider any line (not equal to
the axis) which goes through P. By Lemma 1, we have a fixed line on which P is
the only fixed point. The n non-fixed points are interchanged in pairs, and n
must be even, contrary to hypothesis.


LEMMA 3. If the plane π is doubly transitive, π admits at least one involution.


Proof. Let A and B be any two points of π. If π is doubly transitive, there
exists a collineation σ which interchanges A and B. Thus σ is of even order and,
since π is finite, some power of σ is of order two.


LEMMA 4. Let ρ and σ be two involutions with the same axis l but with different
centers. Then ρσ is a perspectivity with axis l and center on l.


Proof. Since every point on l is left fixed by both ρ and σ, these points are
left fixed by ρσ. Suppose that some point Q not on l is fixed by ρσ. If ρ
interchanges Q with some point Q₁, then σ must do likewise and therefore both
Q and Q₁ are fixed by ρσ. If Q = Q₁, then Q must be the common center for ρ
and σ. If Q ≠ Q₁, ρσ is the identity by Lemma 1 and ρ = σ.


LEMMA 5. Let ρ and σ be two involutions with the same center P but with
different axes. Then ρσ is a perspectivity with center P and axis l which goes
through P.


LEMMA 6. Let ρ and σ be two involutions such that the axis of each is incident
with the center of the other. Then ρσ is an involution with axis going through the
centers of ρ and σ and center on the intersection of the axes of ρ and σ.










Proof. Let ρ have center P and axis l. Let Q and m be the center and axis
respectively of σ. Let l and m intersect in R. Then P ∈ m and Q ∈ l. P, Q, and
R are fixed by both ρ and σ, and therefore by ρσ. Likewise, lines l, m, and PQ
are fixed by both ρ and σ. Let A and A₁ be two points on l interchanged by σ.
Since all points of l are fixed by ρ, ρσ also interchanges A and A₁. Similarly,
if ρ interchanges B and B₁ on m, ρσ also interchanges B and B₁. Thus, except for
P, Q, and R, the points on l and m are interchanged in pairs by ρσ.

Let X be any point not on the sides of the triangle PQR. Let PX intersect l
in the point A and let QX intersect m in the point B. Let A → A₁ and B → B₁
under ρσ. Then if PA₁ and QB₁ intersect in X₁, X and X₁ are interchanged by
ρσ. We conclude that ρσ must be an involution with axis PQ and center R.


Definition. If there is an involution with center P and axis l, l will be said
to be an axis of P. 


2. Doubly transitive planes 


THEOREM 1. Let π be a doubly transitive finite plane, where n is odd and not
a square. Then π is Desarguesian.


Proof. By Lemmas 2 and 3, π admits an involution σ with some axis l
and center P ∉ l. Let A be some point on l and A₁ be some point not on l.
Consider the collineation ρ which carries P into P and A into A₁. This
collineation will carry l into some line l₁ ≠ l. Then l₁ will be an axis of P with
respect to the involution ρ⁻¹σρ.

Let l and l₁ intersect in Q. Then Q lies on at least two axes of P. Suppose that
Q lies on exactly k axes of P. Given any point Q₁ ≠ P, there is a collineation
which leaves P fixed and carries Q into Q₁. Hence every point except P lies
on k axes of P, since a collineation which fixes P maps axes of P into axes of P.

Now no line through P can be an axis of P, since n is odd. Consider any line
through P. This line will contain n points besides P, each lying on k axes of P.
Hence there are exactly nk axes of P.

If l is some axis of P, each of the n + 1 points on l must lie on k − 1 axes
of P besides l itself. Hence


nk = (n + 1)(k − 1) + 1


and k = n. Therefore, each of the n² lines not through P is an axis of P.
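
The step from the counting equation to k = n, and the count of lines not through P, can be written out as follows. This is only a worked restatement of the arithmetic above, using the standard fact that a plane of order n has n² + n + 1 lines, of which n + 1 pass through any given point.

```latex
\begin{align*}
nk &= (n+1)(k-1) + 1 = nk - n + k - 1 + 1 = nk - n + k
   \;\Longrightarrow\; k = n, \\
\#\{\text{axes of } P\} &= nk = n^2 = (n^2 + n + 1) - (n + 1)
   = \#\{\text{lines not through } P\}.
\end{align*}
```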
Now, let Q ≠ P and let σ = σ₁, σ₂, ..., σ_n be involutions corresponding
to the n axes of P which go through Q. Then 1 = σσ₁, σσ₂, ..., σσ_n are n
distinct collineations, which by Lemma 5 are perspectivities with center P
and axes which go through P. Moreover, they all have the same axis PQ,
since Q is fixed in each case. Let A be any point not on the line PQ. The
images of A under the collineations τ_i = σσ_i all lie on the line AP. Moreover,
if τ_i ≠ τ_j, the image of A under τ_i is different from its image under τ_j, for
otherwise τ_iτ_j⁻¹ would leave A fixed as well as every point on PQ and every
line through P. By Lemma 1, τ_iτ_j⁻¹ would then be the identity. Since there
are n collineations τ_i, A can be carried into every point on AP except P by
some collineation τ_i.

Baer (3, Theorem 6.2) has shown that this implies that the Minor Theorem 
of Desargues is satisfied for center P and axis PQ. By further use of double 
transitivity, there are sufficient collineations to leave P fixed and carry PQ 
into any line through P, and finally P can be taken into any point. Thus the 
Minor Theorem of Desargues is satisfied throughout and the plane is alter- 
native (9). But Zorn (18) has shown that every finite alternative division 
ring is Desarguesian. 


3. Planes doubly transitive except for points on a special line 


LEMMA 7. Let the plane π admit a group of collineations Σ which is doubly
transitive for points not on the line l_∞. Then (a) if P ∈ l_∞, the group which leaves
P and l_∞ fixed is transitive on finite points and (b) Σ is transitive over points on
l_∞.


Proof. Without loss of generality, we can assume that Σ leaves l_∞ fixed,
since otherwise π is doubly transitive on all points and the plane is
Desarguesian. Let A, B and A₁, B₁ be two pairs of finite points such that A, B and P
are collinear and also A₁, B₁, and Q are collinear, where Q ∈ l_∞. Then the
member of Σ which carries the pair A, B into the pair A₁, B₁ will carry P
into Q. If P = Q, P is fixed.


Definition. A perspectivity with axis l_∞ and center on l_∞ will be called a
translation.


LEMMA 8. Let π be a plane of order n, where n is odd and not a square. If π
admits a group Σ of collineations as in Lemma 7, then π admits a translation.


Proof. As in Lemmas 2 and 3, π admits an involution with some axis l
and center P ∉ l. Suppose first that l_∞ is the axis of an involution and P ∉ l_∞
is the center. Since we have transitivity on finite points, there is another
involution with center P₁ ≠ P, and with l_∞ as axis. In this case, Lemma 8
follows from Lemma 4.

On the other hand, suppose that there is an involution with some ordinary
line l as axis. Since l_∞ is fixed, the center P is on l_∞. From Lemma 7(a), every
finite point is on an axis of P and P has more than one axis.

Now if two axes of P go through the same point Q ∈ l_∞, it follows from
Lemma 5 that there is a perspectivity with center P and axis through P.
Moreover, Q is on the axis (by Lemma 1) which must therefore be l_∞.

Thus, Lemma 8 is established except for the possibility that no two axes
of P intersect in a point on l_∞. Let m be an ordinary line through P.
Corresponding to the n finite points on m, there are n axes of P. If no two of them
intersect on a point of l_∞, every point on l_∞ except P must lie on an axis of P.
From Lemma 7(b), it follows that a similar statement must hold for every
point on l_∞. Thus, there exist points P and Q ∈ l_∞ for which the conditions of
Lemma 6 apply and l_∞ is the axis of some involution. But we have already
shown that Lemma 8 follows in this case.


THEOREM 2. Let π be a plane of order n, where n is odd and not a square and
let π be doubly transitive on points not belonging to the line l_∞. Then (a) π is a
Veblen-Wedderburn plane and (b) π is Desarguesian if any of the coordinate
rings (using l_∞ as the special line) satisfy the left distributive law or the associative
law for multiplication.


Proof. By Lemma 8, π admits a translation. Suppose that A → B under a
translation. From the double transitivity, it follows that any finite point A₁
can be carried into any finite point B₁ by a translation. (Recall that, without
loss of generality, we can assume that l_∞ is fixed by all collineations.) Thus
finite points are transitive under translations. André (1) calls such planes
translation planes and has shown that they are Veblen-Wedderburn planes.

André also has divided translation planes into six mutually exclusive
classes (2, Theorem 4). In cases II and IV, the plane is not transitive on
infinite points, contrary to Lemma 7. Case III is Desarguesian if π is finite.
Case V is doubly transitive on finite points, but n is a square (= 9). The
remaining two cases are those in which π is Desarguesian or in which no
coordinate ring satisfies either the left distributive law or the associative law.


REFERENCES 


1. J. André, Ueber nicht-Desarguesche Ebenen mit transitiver Translationsgruppe, Math. Z.
60 (1954), 156-186.
2. J. André, Projektive Ebenen über Fastkörpern, Math. Z. 62 (1955), 137-160.
3. R. Baer, Homogeneity of projective planes, Amer. J. Math. 64 (1942), 137-152.
4. R. Baer, Projectivities with fixed points on every line in the plane, Bull. Amer. Math. Soc. 52
(1946), 273-286.
5. R. Baer, Projectivities of finite projective planes, Amer. J. Math. 69 (1947), 653-684.
6. R. H. Bruck, Difference sets in a finite group, Trans. Amer. Math. Soc. 78 (1955), 464-481.
7. R. H. Bruck and E. Kleinfeld, The structure of alternative division rings, Proc. Amer. Math. Soc.
2 (1951), 878-890.
8. T. A. Evans and H. B. Mann, On simple difference sets, Sankhya 11 (1951), 357-364.
9. M. Hall, Projective planes, Trans. Amer. Math. Soc. 54 (1943), 229-277.
10. M. Hall, Cyclic projective planes, Duke Math. J. 14 (1947), 1079-1090.
11. A. J. Hoffman, Cyclic affine planes, Can. J. Math. 4 (1952), 295-301.
12. H. B. Mann, Some theorems on difference sets, Can. J. Math. 4 (1952), 222-226.
13. T. G. Ostrom, Concerning difference sets, Can. J. Math. 5 (1953), 421-424.
14. T. G. Ostrom, Ovals, dualities, and Desargues's theorem, Can. J. Math. 7 (1955), 417-431.
15. J. Singer, A theorem in finite projective geometry and some applications to number theory,
Trans. Amer. Math. Soc. 43 (1938), 377-385.
16. L. Skornyakov, Projective planes, Translation No. 99, Amer. Math. Soc. (1951).
17. G. Zappa, Sui piani grafici finiti transitivi e quasi-transitivi, Ric. di Mat., II (1953),
74-287.
18. M. Zorn, Theorie der alternativen Ringe, Abh. Math. Sem. Hamb. Univ. 8 (1930), 123-147.











Montana State University 











RESOLVENTS OF CERTAIN LINEAR GROUPS 
IN A FINITE FIELD 


L. CARLITZ 
1. Introduction. Let $F_q = GF(q)$ denote the finite field of order
$q = p^n$, where p is a prime. Consider the group $\Gamma$ of linear transformations
(1.1) $x' = (ax + b)/(cx + d)$


with coefficients $a, b, c, d \in F_q$ and of determinant 1. The order of $\Gamma$ is


$\tfrac{1}{2}q(q^2 - 1)$ or $q(q^2 - 1)$ according as q is odd or even, i.e., according as p > 2
or p = 2. Put


(1.2) $J = J(x) = Q^{\frac{1}{2}(q+1)} / L^{\frac{1}{2}(q^2-q)}$ (p > 2),
where
(1.3) $L = x^q - x$, $Q = (x^{q^2} - x)/(x^q - x) = L^{q-1} + 1$;


when p = 2 the factor $\tfrac{1}{2}$ in the exponents in the right member of (1.2) is
omitted. It is familiar that L is the product of distinct linear polynomials
$x + a$ and Q is the product of distinct irreducible quadratics $x^2 + ax + b$.
Moreover (1, p. 4) J is an absolute and fundamental invariant of $\Gamma$, that is,
every absolute invariant is a rational function of J. The equation


(1.4) J(x) = y, 


where y is an indeterminate, is normal over $F_q(y)$ with Galois group $\Gamma$.
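
The second equality in (1.3) and the degree of J lend themselves to a quick machine check. The following is a minimal sketch, not part of the paper: polynomials over the prime field GF(p) are stored as coefficient lists (index = exponent), the helper names are illustrative, and the reading of (1.2) as J = Q^((q+1)/2) / L^((q²−q)/2) for odd q is assumed.

```python
def trim(a):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_mul(a, b, p):
    """Product of two coefficient lists over GF(p)."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return trim(out)


def poly_pow(a, n, p):
    """n-th power of a polynomial over GF(p) by repeated multiplication."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, a, p)
    return out


def x_pow_minus_x(m, p):
    """Coefficient list of x^m - x over GF(p), for m >= 2."""
    out = [0] * (m + 1)
    out[1] = (-1) % p
    out[m] = 1
    return out


def q_identity_holds(p, q):
    """Check L * (L^(q-1) + 1) == x^(q^2) - x, i.e. Q = L^(q-1) + 1."""
    L = x_pow_minus_x(q, p)
    Q = poly_pow(L, q - 1, p)
    Q[0] = (Q[0] + 1) % p
    return poly_mul(L, trim(Q), p) == x_pow_minus_x(q * q, p)


def j_numerator_degree(q):
    """Degree of Q^((q+1)/2) for odd q, using deg Q = q^2 - q."""
    return (q * q - q) * (q + 1) // 2


if __name__ == "__main__":
    for p, q in [(2, 4), (5, 5), (7, 7), (3, 9), (11, 11)]:
        print(q, q_identity_holds(p, q))
    print([j_numerator_degree(q) for q in (5, 7, 11)])
```

For odd q the numerator degree equals ½q(q² − 1), the order of the group, which is what one expects of a fundamental invariant.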

If we put $u = L^{\frac{1}{2}(q-1)}$ or $L^{q-1}$ according as p > 2 or p = 2, then (1.2) and
(1.4) imply
(1.5) $(u^2 + 1)^{\frac{1}{2}(q+1)} = y u^q$ (p > 2),
(1.6) $(u + 1)^{q+1} = y u^q$ (p = 2),


resolvents of degree q + 1. The principal object of the present paper is to
construct resolvents of lower degree when they occur. It is well known (see
for example (2, p. 287)) that $\Gamma$ can be represented as a permutation group of
degree $\le q$ only when

(1.7) q = 5, 7, 9, 11,


in which case the degree is 5, 7, 6, 11, respectively. Resolvents are constructed
for the minimum degree in each case. For example when q = 5 the quintic
resolvent is


(1.8) $t^5 - 2t^3 = J$,
while for q = 7 we get
(1.9) $w^7 + 4w^5 - 4w^4 = J$.







Received October 8, 1955. 


When q = 4, (1.6) is a quintic. In this case we construct a sextic resolvent
(1.10) e+ th = J. 

Incidentally when q = 9, we again get the equation (1.10). However it should
be observed that in the one case (1.10) has group YW, while in the other the 
group is W.. 

Finally in §7 we consider briefly the ternary linear group. For q = 2 the
group is of order 168 and we construct a resolvent of degree 8. In this case the
resolvent of degree 7 is easily found (compare the case q = 4).

For the discussion of the corresponding problems in the classical case the 


reader is referred to (3, Ch. 13; 5; 7). 

2. q = 5. In this case $\Gamma$ is icosahedral and has a tetrahedral subgroup
generated by
(2.1) , , x+2 


This gives rise to the 12 functions 





(2.2) dey et SES, SSF ete FTF. 
Applying the second of (2.1) to $(x^4 + 1)/x^2$ we get

(2.3) $t = T/L^2$,

where

(2.4) $T = T(x) = x^{12} + 2x^8 + 2x^4 + 1$.


Since $x^4 + 1 = (x^2 + 2)(x^2 - 2)$, it is clear that T is the product of six
irreducible quadratics. Consequently


(2.5) $Q = TU$,
where U is a polynomial of degree 8; we find that
(2.6) $U = U(x) = x^8 - x^4 + 1$.


Since the function (2.3) belongs to a tetrahedral subgroup of $\Gamma$, it must
satisfy an equation of degree 5 with coefficients in $F_5(J)$. While this equation
can be found by the method of undetermined coefficients it is easier to make
use of the identity
(2.7) $T^2(x) - U^3(x) = 2L^4$,


which can be verified without difficulty. Incidentally (2.7) is one of a set of
five identities obtained by replacing x by x + c, c = 0, 1, 2, 3, 4. Using (2.3),
(2.6), (2.7) we get

(2.8) $t^5 - 2t^3 = J$.


This proves 


THEOREM 1. For q = 5, (1.4) admits the quintic resolvent (2.8). 
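
The identities behind Theorem 1 can be checked by direct expansion over GF(5). The sketch below is an illustration only; it assumes the readings T = x¹² + 2x⁸ + 2x⁴ + 1, U = x⁸ − x⁴ + 1 and t = T/L², so that the resolvent t⁵ − 2t³ = J becomes T⁵ − 2T³L⁴ = Q³ after clearing the denominator L¹⁰. The helper names are hypothetical.

```python
P = 5  # characteristic; all coefficients are reduced mod P


def trim(a):
    """Drop trailing zero coefficients."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def add(a, b, scale=1):
    """Return a + scale * b over GF(P)."""
    out = [0] * max(len(a), len(b))
    for i, c in enumerate(a):
        out[i] = c % P
    for i, c in enumerate(b):
        out[i] = (out[i] + scale * c) % P
    return trim(out)


def mul(a, b):
    """Product of two coefficient lists over GF(P)."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return trim(out)


def power(a, n):
    out = [1]
    for _ in range(n):
        out = mul(out, a)
    return out


def poly(terms):
    """Build a coefficient list from {exponent: coefficient}."""
    out = [0] * (max(terms) + 1)
    for e, c in terms.items():
        out[e] = c % P
    return trim(out)


L = poly({5: 1, 1: -1})
T = poly({12: 1, 8: 2, 4: 2, 0: 1})
U = poly({8: 1, 4: -1, 0: 1})
Q = add(power(L, 4), [1])  # Q = L^(q-1) + 1


def check_q5():
    """Return (Q == T*U, T^2 - U^3 == 2 L^4, T^5 - 2 T^3 L^4 == Q^3)."""
    factorisation = mul(T, U) == Q
    identity = add(power(T, 2), power(U, 3), -1) == mul([2], power(L, 4))
    lhs = add(power(T, 5), mul(mul([2], power(T, 3)), power(L, 4)), -1)
    resolvent = lhs == power(Q, 3)
    return factorisation, identity, resolvent


if __name__ == "__main__":
    print(check_q5())
```

The third check is a consequence of the first two, since T⁵ − 2T³L⁴ = T³(T² − 2L⁴) = T³U³.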








It may be noted that Garrett (6) has proved that a quintic equation in a
field of characteristic 5 can in general be reduced to the form
(2.9) $z^5 + az^2 + b = 0$.
Replacing t by 1/z in (2.8), we evidently get an equation of the form (2.9).

3. q = 7. The group $\Gamma$ is now the simple group LF(2, 7) of order 168. We


require a subgroup $G_{24}$ of order 24. Such an octahedral subgroup is generated
by


1 2 3 4 0 3 
(3.1) 3 = (; . Ss = ¥ ‘) $3 = > = 


The transformations $s_1$, $s_2$ generate a dihedral subgroup $D_8$ of order 8; a
function belonging to $D_8$ is


(3.2) $\xi = (x^2 + 2x - 2)^4/L$.
Applying $s_3$ to $\xi$ we find that

(3.3) $t = T^4/L^3$,
where


(3.4) $T = (x^2 + 2x - 2)(x^2 + 4x - 1)(x^2 + x - 4) = x^6 - x^3 - 1$


belongs to the group $G_{24}$. Consequently t satisfies an equation of degree 7.
It is however more convenient to find the equation of degree 7 satisfied by


(3.5) $w = t - 4 = W/L^3$,
where
(3.6) $W = T^4 - 4L^3$.
We observe first that $W \mid Q$. To prove this let $a^2 = -1$, $a \in GF(7^2)$.
Then by (3.4), $T(a) = -a^3 - 2$, which implies $T^4(a) = 3a^3$; also $L^3(a) =
(a^7 - a)^3$, so that
$W(a) = 3a^3 + 4a^3 = 0$.
This implies $x^2 + 1 \mid W(x)$. Now applying the substitutions (3.1), we find that W
is a product of distinct irreducible quadratics, in particular it is clear that
$W \mid Q$. Also (3.6) implies (W, T) = 1. We have accordingly


(3.7) Q = TWU, 


where U is a polynomial of degree 12. 
Returning to (3.5) we now construct the equation of degree 7 satisfied by w. 
This is evidently of the form 


$w^7 + a_1 w^6 + \dots + a_6 w = bJ$
or what is the same thing
(3.8) $W^7 + a_1 W^6 L^3 + \dots + a_6 W L^{18} = bQ^4$.


It follows immediately from (3.7) that $a_4 = a_5 = a_6 = 0$; also b = 1. Since
$W = x^{24} - 4x^{21} + \dots$, comparison of coefficients yields $a_1 = 0$, $a_2 = 4$,
$a_2 + a_3 = 0$. Thus (3.8) reduces to

(3.9) $W^7 + 4W^5L^6 - 4W^4L^9 = Q^4$.

In terms of w this is

(3.10) $w^7 + 4w^5 - 4w^4 = J$.


This proves 
THEOREM 2. For q = 7 (1.4) admits the resolvent (3.10) of degree seven.


If we substitute from (3.7), (3.9) becomes


(3.11) $W^3 + 4WL^6 - 4L^9 = T^4U^4$.
Next using (3.6) we get
(3.12) $T^8 + 2T^4L^3 + 3L^6 = U^4$.


In terms of T alone, (3.12) becomes

(3.13) $(T^4 - 4L^3)^4 (T^{12} + 2T^8L^3 + 3T^4L^6) = Q^4$,

from which the equation for t follows at once:

(3.14) $(t - 4)^4 (t^3 + 2t^2 + 3t) = J$.

This equation can also be obtained directly from (3.10).
Concerning the polynomials T, U, W we may state


THEOREM 3. The polynomials T, U, W satisfy (3.6), (3.7), (3.11), (3.12).
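
The remark that (3.14) can be obtained directly from (3.10) amounts to substituting w = t − 4 and reducing modulo 7. The sketch below checks only this algebraic equivalence of the two forms as polynomials in t; it does not verify that either equals J. It assumes the readings w⁷ + 4w⁵ − 4w⁴ for (3.10) and (t − 4)⁴(t³ + 2t² + 3t) for (3.14), and the helper names are illustrative.

```python
P = 7  # characteristic


def trim(a):
    """Drop trailing zero coefficients."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def add(a, b, scale=1):
    """Return a + scale * b over GF(P)."""
    out = [0] * max(len(a), len(b))
    for i, c in enumerate(a):
        out[i] = c % P
    for i, c in enumerate(b):
        out[i] = (out[i] + scale * c) % P
    return trim(out)


def mul(a, b):
    """Product of two coefficient lists over GF(P)."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return trim(out)


def power(a, n):
    out = [1]
    for _ in range(n):
        out = mul(out, a)
    return out


# w = t - 4, written as a polynomial in t (index = exponent of t).
W_IN_T = [(-4) % P, 1]


def form_3_10_in_t():
    """w^7 + 4 w^5 - 4 w^4 with w = t - 4, expanded in t."""
    head = add(power(W_IN_T, 7), mul([4], power(W_IN_T, 5)))
    return add(head, mul([4], power(W_IN_T, 4)), -1)


def form_3_14_in_t():
    """(t - 4)^4 (t^3 + 2 t^2 + 3 t), expanded in t."""
    return mul(power(W_IN_T, 4), [0, 3, 2, 1])


if __name__ == "__main__":
    print(form_3_10_in_t() == form_3_14_in_t())
```

The agreement rests on the congruence (t − 4)³ + 4(t − 4) − 4 ≡ t³ + 2t² + 3t (mod 7).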


4. q = 11. The group $\Gamma$ is now the simple group LF(2, 11), of order 660.
We require a subgroup $G_{60}$ of order 60. Such an icosahedral subgroup is generated
by (see for example (4, p. 479))


(4.1) 3 = a 1 Ss: = r a 


of period 5 and 2, respectively. Note that 


(4.2) $152 = (; - 


which is of period 3. It is easily seen that $(x^2 + 1)/(x - 3)$ is invariant under
$s_2$ and next that $(x^{10} + 1)/(x^5 - 1)$ is invariant under (4.1). A little
computation now shows that


(4.3) $t = T^2/L^5$,
where
(4.4) $T = x^{30} + 5x^{25} + 5x^{20} + 5x^{10} - 5x^5 + 1$,


belongs to $G_{60}$. Notice that T is a product of distinct irreducible quadratics,
so that $T \mid Q$.








In the next place application of $s_1$ to the quadratic $x^2 - 5x + 2$ gives
$H_1 = x^{10} + 5x^5 - 1$. Applying ses,* to $x^2 - 5x + 2$ we get $x^2 - 4x + 2$ and
this gives $H_2 = x^{10} - 2x^5 - 1$. If we put


(4.5) $H = H_1H_2 = x^{20} + 3x^{15} - x^{10} - 3x^5 + 1$
we find that
(4.6) $h = H^3/L^5$

also belongs to $G_{60}$. Note that H, like T, is a product of distinct irreducible
quadratics. Moreover it is not difficult to verify that T and H satisfy the
relation


(4.7) $T^2 - H^3 = L^5$;
in terms of t and h this is
(4.8) $t - h = 1$.


(For the polynomials corresponding to T, H and L in the classical case, see
(5, p. 54). The differentiation method used there is however not applicable 
here.) 


Since (4.7) implies (7, H) = 1, it follows that 


(4.9) Q = THU, 
where U is a polynomial of degree 30. It is also easily verified that 
(4.10) u = U/L* 


belongs to the group Ys. Thus each of the functions ¢, h, u satisfies an equation 
of degree 11, which we shall now set up. We notice first that 


(4.11) U = T?+ 41. 


To prove (4.11) put φ(x) = (U - T^2)/L^5 and let β be a number in some
extension of F_q such that β and its conjugates under 𝔄_5 are distinct; we may,
for example, take β as the root of an irreducible polynomial of the third degree.
Then since φ(x) is invariant under 𝔄_5 we have φ(β_1) = φ(β), where β_1 is any
conjugate of β under 𝔄_5. Then φ(x) - φ(β) vanishes for 60 distinct values of
x; since deg φ(x) < 60 it follows that φ(x) is constant. Comparison of coeffic-
ients now yields (4.11). Incidentally (4.7) can be proved in a similar way.
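The relations (4.5), (4.7), (4.9) and (4.11) can also be checked by machine. The following is a minimal sketch, not part of the original argument. It assumes the readings T = x^30 + 5x^25 + 5x^20 + 5x^10 - 5x^5 + 1, H = x^20 + 3x^15 - x^10 - 3x^5 + 1, L = x^11 - x and Q = (x^121 - x)/(x^11 - x) over F_11; the helper names (poly, add, mul, power) are illustrative only.

```python
# Polynomials over F_11 as coefficient lists, lowest degree first.
P = 11

def trim(a):
    """Reduce coefficients mod P and drop trailing zeros."""
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def poly(terms):
    """Build a polynomial from a dict {exponent: coefficient}."""
    out = [0] * (max(terms) + 1)
    for e, c in terms.items():
        out[e] = c
    return trim(out)

def add(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return trim([x + y for x, y in zip(a, b)])

def scale(c, a):
    return trim([c * x for x in a])

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def power(a, n):
    r = [1]
    for _ in range(n):
        r = mul(r, a)
    return r

L = poly({11: 1, 1: -1})
T = poly({30: 1, 25: 5, 20: 5, 10: 5, 5: -5, 0: 1})
H1 = poly({10: 1, 5: 5, 0: -1})
H2 = poly({10: 1, 5: -2, 0: -1})
H = poly({20: 1, 15: 3, 10: -1, 5: -3, 0: 1})
Q = poly({10 * k: 1 for k in range(12)})      # (x^121 - x)/(x^11 - x)
U = add(power(T, 2), scale(4, power(L, 5)))   # (4.11), taken as a definition

check_45 = mul(H1, H2) == H                                       # (4.5)
check_47 = add(power(T, 2), scale(-1, power(H, 3))) == power(L, 5)  # (4.7)
check_49 = mul(mul(T, H), U) == Q                                 # (4.9)
print(check_45, check_47, check_49)
```

Here U is defined by (4.11) and (4.9) is then tested, so the check covers both relations at once.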

Making use of (4.11) it is not difficult to find the equation of degree 11
satisfied by u. This equation is of the form


u^11 + a_1u^10 + ... + a_10u = J


or what is the same thing


(4.12) U^11 + a_1U^10L^5 + ... + a_10UL^50 = Q^6.

Since U|Q we have a_6 = ... = a_10 = 0. Also since all terms in Q have expo-
nents divisible by 10, it is clear that a_1 = 0. Thus (4.12) becomes

(4.13) U^5 + a_2U^3L^10 + ... + a_5L^25 = T^6H^6.


Using (4.7) and (4.11) we may rewrite (4.13) in terms of T; the resulting
relation is of degree 10 and must therefore be an identity in T. Comparing
coefficients we readily find that


a_2 = 6, a_3 = 3, a_4 = 3, a_5 = 6.
Thus (4.12) becomes


(4.14) U^11 + 6U^9L^10 + 3U^8L^15 + 3U^7L^20 + 6U^6L^25 = Q^6,
and therefore
(4.15) u^11 + 6u^9 + 3u^8 + 3u^7 + 6u^6 = J.


We may rewrite (4.14) as
U^5 + 6U^3L^10 + 3U^2L^15 + 3UL^20 + 6L^25 = T^6H^6
and remark that the left member is
(U - 5L^5)^2(U^3 - U^2L^5 + 4UL^10 + 2L^15)
= (U - 5L^5)^2(U - 4L^5)^3
= (T^2 - L^5)^2T^6 = H^6T^6,
by (4.7) and (4.11), which is correct. Conversely we may obtain (4.14) by
retracing these steps.
In view of the above it is convenient to rewrite (4.15) as


(4.16) u^6(u - 5)^2(u - 4)^3 = J.
The corresponding equations for t and h are

(4.17) t^3(t - 1)^2(t + 4)^6 = J

and

(4.18) h^2(h + 1)^3(h + 5)^6 = J.
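As a quick consistency check (a sketch only, with illustrative helper names), one can expand (4.16) modulo 11 and compare it with (4.15), and then substitute u = t + 4 and t = h + 1, which follow from (4.11) and (4.8), to recover (4.17) and (4.18).

```python
# Polynomials over F_11 as coefficient lists, lowest degree first.
P = 11

def trim(a):
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def add(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return trim([x + y for x, y in zip(a, b)])

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def power(a, n):
    r = [1]
    for _ in range(n):
        r = mul(r, a)
    return r

def compose(f, g):
    """Return f(g(x)) using Horner's rule."""
    r = []
    for c in reversed(f):
        r = add(mul(r, g), [c])
    return r

x = [0, 1]
# (4.15): u^11 + 6u^9 + 3u^8 + 3u^7 + 6u^6
f15 = trim([0] * 6 + [6, 3, 3, 6, 0, 1])
# (4.16): u^6 (u - 5)^2 (u - 4)^3
f16 = mul(mul(power(x, 6), power([-5, 1], 2)), power([-4, 1], 3))
# (4.17): t^3 (t - 1)^2 (t + 4)^6
f17 = mul(mul(power(x, 3), power([-1, 1], 2)), power([4, 1], 6))
# (4.18): h^2 (h + 1)^3 (h + 5)^6
f18 = mul(mul(power(x, 2), power([1, 1], 3)), power([5, 1], 6))

same_15_16 = f15 == f16
same_16_17 = compose(f16, [4, 1]) == f17   # u = t + 4
same_17_18 = compose(f17, [1, 1]) == f18   # t = h + 1
print(same_15_16, same_16_17, same_17_18)
```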


We may state 


THEOREM 4. For q = 11, (1.4) admits the resolvents (4.16), (4.17), (4.18)
of degree 11.


THEOREM 5. The polynomials T, H, U satisfy (4.7), (4.9), (4.11) and 
(4.14). 


5. q = 4. When q = 4, the equation (1.6) becomes
(5.1) (u + 1)^5 = yu^4,
where u = (x^4 - x)^3. Thus (5.1) is a quintic resolvent of (1.4). The group in
this case is 𝔄_5. We shall construct a sextic resolvent. This can be done most
rapidly by making use of an irreducible quadratic, say
(5.2) P = x^2 + x + φ,
where φ^2 + φ + 1 = 0, φ ∈ F_4. Now put


(5.3) t = P^5/L^2.













It is easily verified that t belongs to the dihedral group D_5 of order 10 generated
by


1 ¢ 1 ¢? 
(5.4) 5; = % i ' $2 = (; 1 ) 


Thus t must satisfy an equation of degree 6. Indeed from (5.2)
P^2 + P = x^4 + x + 1 = L + 1,


Q = L^3 + 1 = (P^2 + P + 1)^3 + 1 = P^6 + P^5 + P^3 + P
= P^6 + P(P^2 + P + 1)^2,
so that
(5.5) Q = P^6 + PL^2.
Using (5.3), (5.5) becomes
Q = PL^2(t + 1),
which reduces to
(5.6) t(t + 1)^5 = J.
This proves 


THEOREM 6. For q = 4, (1.4) admits the resolvent (5.6) of degree 6 as well as 
the resolvent (5.1) of degree 5. 
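The step from (5.2) to (5.5) only uses L = P^2 + P + 1 and arithmetic in characteristic 2, so it can be checked in F_2[P] without reference to x or φ. The sketch below is an illustration under that assumption; polynomials in P are stored as Python integers whose bits are the coefficients, and the function names are hypothetical.

```python
def clmul(a, b):
    """Multiply two polynomials over F_2 stored as bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def clpow(a, n):
    r = 1
    for _ in range(n):
        r = clmul(r, a)
    return r

Pv = 0b10            # the indeterminate P
L = 0b111            # L = P^2 + P + 1, from P^2 + P = L + 1
Q = clpow(L, 3) ^ 1  # Q = L^3 + 1

expanded = 0b1101010                      # P^6 + P^5 + P^3 + P
rhs_55 = (1 << 6) ^ clmul(Pv, clpow(L, 2))  # P^6 + P L^2, i.e. (5.5)

# Fifth power of (5.5) in the factored form Q = P (P^5 + L^2),
# which is what the substitution t = P^5 / L^2 relies on.
fifth = clmul(clpow(Pv, 5), clpow(clpow(Pv, 5) ^ clpow(L, 2), 5))
print(Q == expanded, Q == rhs_55, clpow(Q, 5) == fifth)
```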


We remark that if x denotes any solution of the equation J(x) = y then the 
solutions of ¢® + ¢ = y are the six irreducible quadratics 


x^2 + x + φ, x^2 + x + φ^2, x^2 + φx + 1, x^2 + φx + φ,
x^2 + φ^2x + 1, x^2 + φ^2x + φ^2.
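That there are exactly six monic irreducible quadratics over F_4 can be confirmed by a brute-force search. This is a small sketch with an assumed encoding: the elements 0, 1, φ, φ^2 are the integers 0, 1, 2, 3, addition is XOR, and multiplication is reduced by φ^2 + φ + 1 = 0.

```python
def mul4(a, b):
    """Multiply in F_4 = F_2[phi]/(phi^2 + phi + 1); elements are 0..3."""
    r = 0
    for i in range(2):
        if (b >> i) & 1:
            r ^= a << i
    if r & 0b100:
        r ^= 0b111
    return r

def has_root(a, b):
    """True if x^2 + a x + b has a root in F_4."""
    return any(mul4(r, r) ^ mul4(a, r) ^ b == 0 for r in range(4))

# A quadratic is irreducible exactly when it has no root in the field.
irreducible = sorted((a, b) for a in range(4) for b in range(4)
                     if not has_root(a, b))
print(irreducible)
```

With the encoding above, the pair (a, b) stands for x^2 + ax + b, so (2, 1) is x^2 + φx + 1 and (3, 3) is x^2 + φ^2x + φ^2.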


6. q = 9. The group Γ is now of order 360. We require a subgroup of
index 6. Such an icosahedral subgroup 𝔄_5 is generated by


0 1 01 
(6.1) 5; = * 1+ ,) sg = & i): 


where σ^2 = -1. It is easily verified that
s_1^5 = s_2^2 = (s_1s_2)^3 = 1,


so that 𝔄_5 is indeed the icosahedral group.

Using (6.1) we find that

(6.2) u = U^5/L^6,

where

(6.3) U = x^12 - x^10 + x^6 - x^2 - 1,







belongs to 𝔄_5. Since U is a product of 6 distinct irreducible quadratics, we
have


(6.4) Q = TU,
where T is a polynomial of degree 60. Moreover
(6.5) t = T/L^6


also belongs to 𝔄_5. Consequently we have a relation of the form U^5 - T = cL^6,
or what is the same thing


(6.6) U^6 - Q = cL^6U.

Comparing coefficients of x^66 in both members of (6.6) we get c = 1, so that
(6.7) U^6 - L^6U = Q.

Using (6.4) this becomes

(6.8) T^6 + L^6T^5 = Q^5.


In terms of t as defined by (6.5), (6.8) yields

(6.9) t^6 + t^5 = Q^5/L^36 = J.


We remark that it is not difficult to verify (6.7) by direct computation.
Also (6.7) implies
(6.10) u(u - 1)^5 = J,


which is equivalent to (6.9). We may state


THEOREM 7. For q = 9, (1.4) admits the resolvents (6.9) and (6.10) of
degree 6.
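The direct computation of (6.7) mentioned above can be sketched as follows. The code assumes the readings U = x^12 - x^10 + x^6 - x^2 - 1, L = x^9 - x and Q = (x^81 - x)/(x^9 - x) over F_3; the helper names are illustrative and not taken from the paper.

```python
# Polynomials over F_3 as coefficient lists, lowest degree first.
P = 3

def trim(a):
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def poly(terms):
    out = [0] * (max(terms) + 1)
    for e, c in terms.items():
        out[e] = c
    return trim(out)

def add(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return trim([x + y for x, y in zip(a, b)])

def scale(c, a):
    return trim([c * x for x in a])

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def power(a, n):
    r = [1]
    for _ in range(n):
        r = mul(r, a)
    return r

U = poly({12: 1, 10: -1, 6: 1, 2: -1, 0: -1})
L = poly({9: 1, 1: -1})
Q = poly({8 * k: 1 for k in range(10)})        # (x^81 - x)/(x^9 - x)

L6 = power(L, 6)
check_67 = add(power(U, 6), scale(-1, mul(L6, U))) == Q       # (6.7)
T = add(power(U, 5), scale(-1, L6))                           # T = U^5 - L^6
check_64 = mul(T, U) == Q                                     # (6.4)
check_68 = add(power(T, 6), mul(L6, power(T, 5))) == power(Q, 5)  # (6.8)
print(check_67, check_64, check_68, len(T) - 1)
```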


We shall next construct an equation of degree 6 with group 𝔄_5. This can be
done by using one of the quadratic factors of U, for example x^2 - 1 + σ.


We have
(6.11) (x^10 - 1 + σ)(x^2 - 1 + σ) - σ(x^2 - 1 + σ)^6
       = (1 - σ)(x^12 - x^10 + x^6 - x^2 - 1) = (1 - σ)U,
(6.12) (x^10 - 1 + σ)^2 - (x^2 - 1 + σ)(x^18 - 1 + σ)
       = (1 - σ)(x^18 + x^10 + x^2) = (1 - σ)L^2.
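The two identities can be tested over F_9 = F_3(σ), σ^2 = -1. The sketch below assumes the forms (x^10 - 1 + σ)(x^2 - 1 + σ) - σ(x^2 - 1 + σ)^6 = (1 - σ)U for (6.11) and (x^10 - 1 + σ)^2 - (x^2 - 1 + σ)(x^18 - 1 + σ) = (1 - σ)L^2 for (6.12); these are reconstructions of a damaged display and should be treated as such. Field elements a + bσ are stored as pairs (a, b), and polynomials as dictionaries from exponent to coefficient.

```python
ZERO, ONE = (0, 0), (1, 0)

def fadd(u, v):
    return ((u[0] + v[0]) % 3, (u[1] + v[1]) % 3)

def fmul(u, v):
    # (a + b s)(c + d s) with s^2 = -1
    return ((u[0] * v[0] - u[1] * v[1]) % 3, (u[0] * v[1] + u[1] * v[0]) % 3)

def padd(f, g):
    h = dict(f)
    for e, c in g.items():
        s = fadd(h.get(e, ZERO), c)
        if s == ZERO:
            h.pop(e, None)
        else:
            h[e] = s
    return h

def pmul(f, g):
    h = {}
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            s = fadd(h.get(e1 + e2, ZERO), fmul(c1, c2))
            if s == ZERO:
                h.pop(e1 + e2, None)
            else:
                h[e1 + e2] = s
    return h

def ppow(f, n):
    r = {0: ONE}
    for _ in range(n):
        r = pmul(r, f)
    return r

def pscale(c, f):
    return pmul({0: c}, f)

MINUS_ONE = (2, 0)
C0 = (2, 1)                      # -1 + sigma
D = {2: ONE, 0: C0}              # x^2  - 1 + sigma
N = {10: ONE, 0: C0}             # x^10 - 1 + sigma
M = {18: ONE, 0: C0}             # x^18 - 1 + sigma
U = {12: ONE, 10: MINUS_ONE, 6: ONE, 2: MINUS_ONE, 0: MINUS_ONE}
L = {9: ONE, 1: MINUS_ONE}
ONE_MINUS_SIGMA = (1, 2)
MINUS_SIGMA = (0, 2)

lhs11 = padd(pmul(N, D), pscale(MINUS_SIGMA, ppow(D, 6)))
rhs11 = pscale(ONE_MINUS_SIGMA, U)
lhs12 = padd(ppow(N, 2), pscale(MINUS_ONE, pmul(D, M)))
rhs12 = pscale(ONE_MINUS_SIGMA, ppow(L, 2))
print(lhs11 == rhs11, lhs12 == rhs12)
```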
Put 
g *-—ite 
( i= 7) 
(6.13) . w oat —1+e) 


Then by (6.11) 





- (l—o)U 
(6.14) w- l= 













On the other hand it follows from (6.12) that 








2 i ee (1 = o)L* 
w+ile= @—1+e)™’ 
so that 
. ____(i+e)L 
(6.15) w+il1l= @—i+e™ 
Comparison of (6.15) with (6.14) yields 
Fg (w — 1)* 
6 — - —_ 5 ee 
(6.16) w' +1 778 (w — 1) - 
If we make the substitution 
l—u-—z 
(6.17) Cnr ate 
(6.16) becomes 
(6.18) z^6 + z^5 = u(1 - u)^5.
If we put z = v - 1, (6.18) takes on the more symmetrical form
(6.19) u(1 - u)^5 + v(1 - v)^5 = 0.
Alternatively, since u - t = 1, we have
(6.20) z^6 + z^5 + t^6 + t^5 = 0,


where t is defined by (6.5).


We omit the verification that z belongs to a dihedral subgroup D_5 of 𝔄_5
and state


THEOREM 8. For q = 9, the equation (6.20) has group 𝔄_5 relative to F_q(t).


It is of interest to compare (6.20) with (6.9). Thus for J an indeterminate,
(6.9) has group 𝔄_6, while for -J = t^6 + t^5 the group reduces to 𝔄_5. Since t
belongs to 𝔄_5, this is in agreement with a familiar theorem on the effect on
the Galois group of an adjunction to the coefficient field. In this connection
we remark that a quintic with group 𝔄_5 relative to F(t) is evidently

(6.21) (z^6 - t^6)/(z - t) + (z^5 - t^5)/(z - t) = 0.





7. The ternary group. Define

                | x^{q^i}  y^{q^i}  z^{q^i} |
(7.1)  [ijk] =  | x^{q^j}  y^{q^j}  z^{q^j} | ;
                | x^{q^k}  y^{q^k}  z^{q^k} |

in particular put

(7.2) L = [012], Q_1 = [023]/[012], Q_2 = [013]/[012].







Then L, Q_1, Q_2 are homogeneous polynomials in x, y, z and (see, for example
(8, p. 17)) form a full system of invariants for the ternary linear group over
F_q. Moreover x, y, z satisfy the equation


(7.3) ξ^{q^3} = Q_2ξ^{q^2} - Q_1ξ^q + L^{q-1}ξ.

Indeed the general solution of (7.3) is furnished by

(7.4) ax + by + cz (a, b, c ∈ F_q).
Now in particular when q = 2, the ternary group Γ is of order 168,

(7.5) deg L = 7, deg Q_1 = 6, deg Q_2 = 4.

Also (7.3) becomes

(7.6) ξ^8 = Q_2ξ^4 + Q_1ξ^2 + Lξ,

an equation with group Γ.
Let 

(7.7) X = yz^2 + y^2z, Y = xz^2 + x^2z, Z = xy^2 + x^2y.


Then by (7.6)
Z^4 = x^4(Q_2y^4 + Q_1y^2 + Ly) + y^4(Q_2x^4 + Q_1x^2 + Lx)
    = Q_1Z^2 + L(x^4y + xy^4),
Z^8 = Q_1^2Z^4 + L^2x^2(Q_2y^4 + Q_1y^2 + Ly) + L^2y^2(Q_2x^4 + Q_1x^2 + Lx),


so that

(7.8) Z^8 + Q_1^2Z^4 + L^2Q_2Z^2 + L^3Z = 0.

Similarly X and Y also satisfy (7.8); indeed the general solution of (7.8) is
(7.9) aX + bY + cZ (a, b, c ∈ F_2).
It follows that

(7.10) L(X, Y, Z) = L^3, Q_1(X, Y, Z) = L^2Q_2, Q_2(X, Y, Z) = Q_1^2.


We shall now construct a resolvent of degree 8 for the equation (7.6). To
do this we make use of irreducible factorable polynomials over F_2, that is
polynomials of the type


(7.11) ∏_{i=0}^{2} (x + α^{2^i}y + β^{2^i}z) (α, β ∈ F_8).


The condition that (7.11) be irreducible (relative to F_2) is that α or β be a
primitive number of F_8. We shall restrict our attention to those polynomials
(7.11) that are of rank 3, that is those for which 1, α, β are linearly independent
relative to F_2; it is easily verified that the number of such polynomials is 8.
If we define the field F_8 by means of


(7.12) φ^3 = φ^2 + 1,
then the 8 polynomials in question are given by


(7.13) (a, 8) = (¢, 6"), (6, 4°), (6, 6%), (¢, 4%), 
(*, *), (6%, O°), (6%, o), (6°, o*). 













The polynomials (7.13) are permuted by Γ; each is left invariant by a certain
subgroup of order 21. By direct computation we find that the polynomials are
P_1 = x^3 + y^3 + z^3 + xyz + x^2y + x^2z + y^2z
P_2 = x^3 + y^3 + z^3 + xyz + x^2y + xz^2 + y^2z
P_3 = x^3 + y^3 + z^3 + xyz + x^2y + x^2z + yz^2
P_4 = x^3 + y^3 + z^3 + xyz + x^2y + xz^2 + yz^2
P_5 = x^3 + y^3 + z^3 + xyz + xy^2 + x^2z + y^2z
P_6 = x^3 + y^3 + z^3 + xyz + xy^2 + xz^2 + y^2z
P_7 = x^3 + y^3 + z^3 + xyz + xy^2 + x^2z + yz^2
P_8 = x^3 + y^3 + z^3 + xyz + xy^2 + xz^2 + yz^2.

Using (7.7) we find that the polynomials P_j can be exhibited as
P_1 + aX + bY + cZ (a, b, c ∈ F_2).


Consequently if the equation of degree 8 satisfied by P_j is f(ξ) = 0, then writing
ξ = η + P_1, we have f(η + P_1) = 0 when η takes on the values (7.9). It
follows that f(η + P_1) is identical with the left member of (7.8). Hence we
get


(7.14) ξ^8 + Q_1^2ξ^4 + L^2Q_2ξ^2 + L^3ξ = A
as the equation satisfied by P_j, where


(7.15) A = ∏_{j=1}^{8} P_j.


It remains to compute the coefficient A. Since deg A = 24 and A is an
invariant we have


A = aQ_1^4 + bQ_1^2Q_2^3 + cQ_2^6 + dL^2Q_1Q_2,


and it is only necessary to determine the constants a, b, c, d. We readily com-
pute the following special values:


Q_1(1,1,z) = z^4 + z^2, Q_2(1,1,z) = z^4 + z^2 + 1, L(1,1,z) = 0.
In particular
Q_1(1,1,1) = 0, Q_2(1,1,1) = 1, L(1,1,1) = 0.


Since for xyz = 111 each P_j = 1 it follows that c = 1. We also find from the
explicit polynomial expressions for P_j, that for xy = 11 each reduces to
z^3 + z + 1 or z^3 + z^2 + 1. This yields the identity


(z^6 + z^5 + z^4 + z^3 + z^2 + z + 1)^4 = a(z^4 + z^2)^4
+ b(z^4 + z^2)^2(z^4 + z^2 + 1)^3 + (z^4 + z^2 + 1)^6.
Put z = ε, ε^2 + ε + 1 = 0, and we get a = 1. For z = φ we get
0 = (φ + 1)^4 + b(φ + 1)^2φ^3 + φ^6,
so that b = 0. To get the coefficient d we take xyz = φφ^2φ^4. We find that
L(φ, φ^2, φ^4) = 1, Q_1(φ, φ^2, φ^4) = Q_2(φ, φ^2, φ^4) = 0. Also it is easily verified that each
P_j = 1. It follows that d = 1. Hence (7.14) becomes


(7.16) ξ^8 + Q_1^2ξ^4 + L^2Q_2ξ^2 + L^3ξ = Q_1^4 + Q_2^6 + L^2Q_1Q_2.


We may now state 


THEOREM 9. For q = 2, the equation (7.16) of degree 8 has the Galois group
LF(3, 2) of order 168. The solutions of (7.16) are the irreducible factorable
cubics P_j; if P_1 is a particular solution then the general solution is


P_1 + aX + bY + cZ,
where X, Y, Z are defined by (7.7) and a, b, c ∈ F_2.
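Equations (7.6) and (7.16) are polynomial identities in x, y, z over F_2, so they can be spot-checked numerically by substituting elements of a larger field of characteristic 2. The sketch below is an illustration only: it works in GF(2^8) with the reduction polynomial t^8 + t^4 + t^3 + t + 1 (a choice made here, not in the paper), and it takes the eight cubics to be x^3 + y^3 + z^3 + xyz plus one term from each of the pairs (x^2y, xy^2), (x^2z, xz^2), (y^2z, yz^2).

```python
import itertools

def gmul(a, b):
    """Multiply in GF(2^8) = F_2[t]/(t^8 + t^4 + t^3 + t + 1)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return r

def gpow(a, n):
    r = 1
    while n:
        if n & 1:
            r = gmul(r, a)
        a = gmul(a, a)
        n >>= 1
    return r

def ginv(a):
    return gpow(a, 254)   # a^(2^8 - 2) for a != 0

def bracket(v, i, j, k):
    """The determinant [ijk] of (7.1) with q = 2 (signs vanish in char 2)."""
    rows = [[gpow(c, 2 ** e) for c in v] for e in (i, j, k)]
    d = 0
    for p in itertools.permutations(range(3)):
        d ^= gmul(gmul(rows[0][p[0]], rows[1][p[1]]), rows[2][p[2]])
    return d

def cubics(x, y, z):
    sq = lambda a: gmul(a, a)
    base = gpow(x, 3) ^ gpow(y, 3) ^ gpow(z, 3) ^ gmul(gmul(x, y), z)
    xy = (gmul(sq(x), y), gmul(x, sq(y)))
    xz = (gmul(sq(x), z), gmul(x, sq(z)))
    yz = (gmul(sq(y), z), gmul(y, sq(z)))
    return [base ^ a ^ b ^ c for a in xy for b in xz for c in yz]

def check(x, y, z):
    """Return (ok_76, ok_716); requires x, y, z independent over F_2."""
    v = (x, y, z)
    L = bracket(v, 0, 1, 2)
    assert L != 0, "x, y, z must be linearly independent over F_2"
    Q1 = gmul(bracket(v, 0, 2, 3), ginv(L))
    Q2 = gmul(bracket(v, 0, 1, 3), ginv(L))
    ok_76 = all(
        gpow(s, 8) == (gmul(Q2, gpow(s, 4)) ^ gmul(Q1, gpow(s, 2)) ^ gmul(L, s))
        for s in v)
    L2 = gmul(L, L)
    rhs = gpow(Q1, 4) ^ gpow(Q2, 6) ^ gmul(gmul(L2, Q1), Q2)
    def lhs(p):
        return (gpow(p, 8) ^ gmul(gmul(Q1, Q1), gpow(p, 4))
                ^ gmul(gmul(L2, Q2), gmul(p, p)) ^ gmul(gmul(L2, L), p))
    ok_716 = all(lhs(p) == rhs for p in cubics(x, y, z))
    return ok_76, ok_716

results = [check(1, 2, 4), check(3, 5, 0x53), check(0x1D, 0x80, 0x6A)]
print(results)
```

A numerical check at a few points does not prove the identity, but a wrong coefficient in (7.16) would almost certainly show up at once.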


REFERENCES


1. L. E. Dickson, An invariantive investigation of irreducible binary modular forms, Trans.
Amer. Math. Soc. 12 (1911), 1-8.
2. L. E. Dickson, Linear Groups (Leipzig, 1901).
3. ———, Modern Algebraic Theories (New York, 1923).
4. R. Fricke, Die elliptischen Funktionen und ihre Anwendungen, II (Leipzig and Berlin, 1922).
5. ———, Lehrbuch der Algebra, I (Braunschweig, 1926).
6. J. R. Garrett, Normal equations and resolvents in fields of characteristic p, Duke Math. J.
18 (1951), 373-384.
7. F. Klein, Vorlesungen über das Ikosaeder und die Auflösung der Gleichungen vom fünften
Grade (Leipzig, 1884).
8. D. E. Rutherford, Modular Invariants, Cambridge Tracts in Mathematics and Mathematical
Physics, No. 27 (Cambridge, 1932).


Duke University,
Durham, North Carolina











PRIME POWER REPRESENTATIONS 
OF FINITE LINEAR GROUPS 


ROBERT STEINBERG 


1. Introduction. There are five well-known, two-parameter families of
simple finite groups: the unimodular projective group, the symplectic group,^1
the unitary group,^2 and the first and second orthogonal groups, each group
acting on a vector space of a finite number of elements (2; 3). If k is the
dimension of this space, we denote these groups by 𝔏_k, 𝔖_k, 𝔘_k, 𝔒_k and 𝔒_k',
respectively. By analogy, groups D2, D4 and ©,’ (which are not simple) can 
be defined. Our main concern then is the proof of the following result:


THEOREM. Let 𝔊 be one of the groups 𝔏_k, 𝔖_k, 𝔘_k, 𝔒_k or 𝔒_k' with k ≥ 2.
Let p be the characteristic of the base field, let d be the order of a p-Sylow subgroup
𝔓 of 𝔊, and let m be the index of the normalizer 𝔑 of 𝔓 in 𝔊. Let Σ be any vector
space of dimension d over a field of characteristic 0 or prime to m. Then 𝔊 has an
irreducible representation of degree d with Σ as the representation space.


The special case 𝔊 = 𝔏_2 was proved by Jordan (7) and Schur (12), inde-
pendently; the case 𝔊 = 𝔏_3 by Brinckmann (1); and the case 𝔊 = 𝔏_n first
by the present author (13) and then later by Green (5). In (4), Frame proved
the theorem when 𝔊 = 𝔘_3. All of these authors dealt only with the character
of the representation, not with the representation itself. The methods of the 
present paper are constructive and yield the representation space and the 
representing matrices explicitly. It is hoped that the geometric ideas introduced 
in this construction may be of independent interest. 

In §§2, 3, 4, and 5, the group 𝔏_n is dealt with. In §6, the other groups are
considered. In §7, a few observations are added.

As a general reference to the definitions and properties of the spaces and 
groups to be considered, we cite (2) and (3). 


2. Preliminary definitions and notations. Throughout §§2, 3, 4 and 5,
V denotes a vector space of dimension n over a field of q elements and of
characteristic p. The symbol S^r denotes an r-dimensional subspace of V;
if the superscript is omitted, the dimension is to be taken as 1; subscripts are
used to distinguish subspaces of the same dimension. The symbol {S^i, S^j, ...}
denotes the subspace spanned by S^i, S^j, ....


Definition 1. An r-simplex is an ordered set of r linearly independent
1-spaces: [S_1, S_2, ..., S_r]. Each S_j is called a vertex of the simplex. An n-
simplex is more briefly called a simplex.

Received January 16, 1956. 


^1 Sometimes called the abelian group.
^2 Sometimes called the hyperorthogonal group.












Definition 2. A composition sequence (abbreviated to c.s.) is a sequence
of subspaces [S^1, S^2, ..., S^n] such that S^j ⊂ S^{j+1} (j = 1, 2, ..., n - 1).


Definition 3. Let A = [S_1, S_2, ..., S_n] be a simplex and V = [S^1, S^2, ...,
S^n] a c.s. Suppose that there exists a permutation σ of the numbers 1, 2, ..., n
such that
(1) S^j = {S_{σ(1)}, S_{σ(2)}, ..., S_{σ(j)}} (j = 1, 2, ..., n).


Then V is called a face of A: a positively or negatively oriented face according
as σ is even or odd. Each of the n! faces of A determines an opposite face
V_1 = [S_1^1, S_1^2, ..., S_1^n] defined by
(2) S_1^j = {S_{σ(n)}, S_{σ(n-1)}, ..., S_{σ(n-j+1)}} (j = 1, 2, ..., n).
Our first result is a useful characterization of opposite faces: 


LEMMA 1. If V = [S^1, S^2, ..., S^n] and V_1 = [S_1^1, S_1^2, ..., S_1^n] are two faces
of a simplex A = [S_1, S_2, ..., S_n] then a necessary and sufficient condition for
V and V_1 to be opposite is that
(3) S^j ∩ S_1^{n-j} = 0 (j = 1, 2, ..., n - 1).


If V and V_1 are two c.s. for which (3) holds, there is a simplex A, uniquely
determined to within an ordering of its vertices, which has V and V_1 as (opposite)
faces.


Proof. The assumption that V and V_1 are opposite faces of A implies the
existence of a permutation σ such that (1) and (2) hold. But then, since the S_j
are linearly independent, (3) holds. If V and V_1 are faces which are not
opposite, there exists a permutation τ, different from σ, such that


S_1^j = {S_{τ(n)}, S_{τ(n-1)}, ..., S_{τ(n-j+1)}} (j = 1, 2, ..., n).


If j is the first index such that σ(j) ≠ τ(j), then S_{σ(j)} ⊂ S^j ∩ S_1^{n-j}, contra-
dicting (3). Suppose finally that V and V_1 are c.s. for which (3) holds. Then it
follows that

S_j = S^j ∩ S_1^{n-j+1}


is 1-dimensional for each j. Thus these S_j are the only possible choices for
vertices of a simplex A relative to which V and V_1 are opposite faces. To
complete the proof, we note that, for each j, S_j ⊂ S^j but S_j ⊄ S^{j-1}. Thus the
S_j are linearly independent so that A = [S_1, S_2, ..., S_n] is a simplex and the
equations (1) and (2) hold with σ the identity.


3. The spaces Σ and Σ*. We proceed now to define representation spaces
and to develop some of their properties. If A is a simplex and V a c.s., we
introduce an inner product (A, V), defined to be 1, -1 or 0 according as V
is a positive face, a negative face or not a face of A. If F is an arbitrary but
fixed field, we can extend this inner product, by linearity, to linear combinations
of simplexes and to linear combinations of c.s. over F. In this way, relative to
this inner product, dual spaces Σ and Σ* are determined. Thus an element of
Σ (Σ*) is a linear combination of simplexes (c.s.), and it is defined to be 0
if and only if it is orthogonal to all elements of Σ* (Σ).

If ε(σ) is defined to be 1 or -1 according as σ is even or odd, an immediate
consequence of the definitions is the following:


LEMMA 2. If [S_1, S_2, ..., S_n] is a simplex and σ a permutation, then
[S_{σ(1)}, S_{σ(2)}, ..., S_{σ(n)}] = ε(σ)[S_1, S_2, ..., S_n].


If A is an (n - 1)-simplex and S a linearly independent 1-dimensional
subspace, then [S, A] is used in our next result to denote the n-simplex whose
vertices are obtained by taking first S and then the vertices of A in a positive
order.


LEMMA 3. Let {A} be a set of (n - 1)-simplexes, all contained in one
(n - 1)-dimensional subspace S^{n-1} of V. Let S be a 1-space not in S^{n-1}. Then
∑ A = 0 implies ∑ [S, A] = 0.

Proof. To each face V = [S^1, S^2, ..., S^{n-1}] of A we make correspond n
faces V_1, V_2, ..., V_n of [S, A] defined by


V_k = [S^1, S^2, ..., S^{k-1}, {S, S^{k-1}}, {S, S^k}, ..., {S, S^{n-1}}] (k = 1, 2, ..., n).
Then one sees that, for each k, ([S, A], V_k) = (-1)^{k-1}(A, V). The required
result now follows by summation on A with V and V_k held fast.


LEMMA 4. Let S_1, S_2, ..., S_{n+1} be 1-spaces, and, for k = 1, 2, ..., n + 1,
let A_k = [S_1, S_2, ..., S_{k-1}, Ŝ_k, S_{k+1}, ..., S_{n+1}], where Ŝ_k denotes that this vertex
is to be omitted. Then
(4) ∑' (-1)^k A_k = 0,


the summation being over those A_k which are simplexes.


Proof. Suppose first that no n of the S_j are linearly dependent. Let V = [S^1,
S^2, ..., S^n] be any face of A_{n+1}. Thus there is a permutation σ such that (1)
holds. It is easy to see that


(A_{n+1}, V) = ε(σ), (A_{σ(n)}, V) = (-1)^{n-σ(n)}ε(σ), (A_j, V) = 0 if j ≠ n + 1, σ(n).


Thus the left side of (4) is orthogonal to each face of A_{n+1}. Similarly, it is
orthogonal to each face of A_1, A_2, ..., A_n (since an interchange of two S_j
changes the sign of this sum); hence it is 0.

In proving the general case, we may assume that n of the S_j, say S_1, S_2, ...,
S_n, are linearly independent and that S_{n+1} is linearly dependent on S_{n-k+1},
S_{n-k+2}, ..., S_n but on no smaller number of S_j (j = 1, 2, ..., n). Then the
k-dimensional case of the first part of the proof shows that the analogue of
(4) holds for the k + 1 k-simplexes formed from the vertices S_{n-k+1}, ..., S_{n+1}.
By Lemma 3 (applied n - k times), we may prefix each of these simplexes
with the vertices S_1, S_2, ..., S_{n-k} and get our result.




We now introduce convenient bases for Σ and Σ*.


THEOREM 1. Let V_0 = [S_0^1, S_0^2, ..., S_0^n] be a fixed c.s. Let 𝔅 be the set of
simplexes A such that (A, V_0) = (-1)^{n(n-1)/2}. For each A_j in 𝔅, let V_j be the face
opposite to V_0. Let 𝔅* be the set of such faces. Then the sets 𝔅 and 𝔅* are dual
bases of Σ and Σ*.


Proof. We first prove that 𝔅 spans Σ. Let 𝔅_r (r = 0, 1, 2, ..., n - 1)
be the set of simplexes A = [S_1, S_2, ..., S_n] such that
S_0^j = {S_1, S_2, ..., S_j} (j = 1, 2, ..., r).


Thus 𝔅_0 consists of all simplexes, and 𝔅_{n-1} = 𝔅. We now show that any
member of 𝔅_r is the signed sum of at most n - r members of 𝔅_{r+1}. Let


A = [S_1, ..., S_r, T_{r+1}, ..., T_n]
be in 𝔅_r. Let S_{r+1} be any 1-space in S_0^{r+1} ∩ {T_{r+1}, ..., T_n}. Then, by Lemma
4, applied to the (n - r)-space {T_{r+1}, ..., T_n}, the (n - r)-simplex [T_{r+1}, ...,
T_n] is a signed sum of at most n - r (n - r)-simplexes, each of which has
S_{r+1} as a vertex. By Lemma 3, A is a signed sum of at most n - r members of
𝔅_{r+1}.

To complete the proof, we invoke Lemma 1, which implies that, if A_j, V_k
are in 𝔅, 𝔅*, then (A_j, V_k) = δ_jk. Thus 𝔅 is linearly independent, hence is a
basis of Σ, and 𝔅* is the dual basis of Σ*.


COROLLARY. A simplex A is the signed sum of those members A_j of 𝔅 which
have a face V_j in common with A and which have V_0 as the opposite face, the
signature being positive or negative according as the common face V_j does or does
not have the same orientations on A and A_j. The sum consists of at most n! terms.
If A is not a member of 𝔅, the sum of these signatures is 0.


Proof. The first two statements follow from the equation A = ∑ (A, V_j) A_j
which is valid since 𝔅 and 𝔅* are dual bases. The equations (A_j, V_0) = (-1)^{n(n-1)/2}
then imply the third statement.


4. The Sylow subgroup 𝔓. We now turn to the group 𝔏_n of unimodular
projective transformations. Since we are concerned only with the permutations
of simplexes effected by members of 𝔏_n, and since a scalar transformation
leaves all simplexes fixed, we may work with 𝔏_n via representative elements of
the unimodular group. Similar considerations apply to the other groups dealt
with in §6.

The order and existence of a useful p-Sylow subgroup of 𝔏_n is given by the
next two lemmas:


LEMMA 5. The order of a p-Sylow subgroup of 𝔊 = 𝔏_n is d = q^{n(n-1)/2}.


LEMMA 6. Let V_0 be a given c.s. Let 𝔑 be the set of elements of 𝔊 which leave
V_0 fixed, and let 𝔓 be the subset of 𝔑 composed of elements whose orders are
powers of p. Then 𝔓 is a p-Sylow subgroup of 𝔊 and 𝔑 is its normalizer in 𝔊.













Proof. For Lemma 5, the order of 𝔊 is available (2; 3). To prove
Lemma 6, we let V_0 = [S_0^1, S_0^2, ..., S_0^n] and then choose an ordered basis
X = (x_1, x_2, ..., x_n) of V such that
(5) S_0^j = {x_1, x_2, ..., x_j} (j = 1, 2, ..., n).


Then, relative to the basis X, 𝔑 consists of all subdiagonal matrices, and 𝔓
of those which in addition have only 1's on the main diagonal. All conclusions
now easily follow.

We proceed to set up a 1-1 correspondence between the elements of the
p-Sylow subgroup 𝔓 defined in Lemma 6 and the members of the basis 𝔅 of Σ
defined in Theorem 1. Again let X be a basis of V satisfying (5). Relative to
X, each element P of 𝔓 is represented by a matrix whose rows s_1, s_2, ..., s_n
may be interpreted as vectors in V. It is easy to see that the simplex


A = (-1)^{n(n-1)/2} [{s_1}, {s_2}, ..., {s_n}]


is a member of 𝔅 and that the correspondence θ defined by θP = A is 1-1
from 𝔓 onto 𝔅. From the fact that each row in the product of two matrices
is the image of the corresponding row of the first matrix under the transformation
corresponding to the second matrix, it follows that (θP_1)P_2 = θ(P_1P_2),
where P_1 and P_2 are any two elements of 𝔓. Thus the right multiplication by
P on the set 𝔓 is mapped by θ onto the application of P to the set 𝔅; and this
mapping is an isomorphism since θ is 1-1. We may sum up the results of this
paragraph in the following theorems:


THEOREM 2. The dimension of Σ (or Σ*) is equal to the order of a p-Sylow
subgroup of 𝔊.


THEOREM 3. If θ is the mapping from 𝔓 onto 𝔅 defined in the preceding
paragraph, then θ induces an isomorphism between the right regular representation
of 𝔓 and the group 𝔓 considered as acting on the set 𝔅. The group 𝔓 is simply
transitive on the members of 𝔅.
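Theorem 2 together with Lemma 1 suggests a simple numerical check: the members of the dual basis are the c.s. opposite to V_0, so the number of c.s. satisfying condition (3) relative to V_0 should equal d = q^{n(n-1)/2}. The following sketch counts them by brute force for q = 2 only; the representation of vectors as bit masks and the function names are assumptions made for illustration.

```python
def extend(S, v):
    """Span of the subspace S (a frozenset of bit-vectors) and the vector v."""
    return frozenset(S | {s ^ v for s in S})

def flags(n):
    """Yield every composition sequence (S^1, ..., S^n) of F_2^n."""
    def rec(chain):
        S = chain[-1]
        if len(S) == 2 ** n:
            yield tuple(chain[1:])
            return
        bigger = {extend(S, v) for v in range(1, 2 ** n) if v not in S}
        for T in bigger:
            yield from rec(chain + [T])
    yield from rec([frozenset({0})])

def count_opposite(n):
    # Fixed c.s. V_0: S_0^j is spanned by the first j unit vectors.
    V0 = [frozenset(range(2 ** j)) for j in range(1, n + 1)]
    total = 0
    for V in flags(n):
        # Condition (3): S^j and S_0^{n-j} meet only in 0 for j = 1, ..., n-1.
        if all(V[j - 1] & V0[n - j - 1] == {0} for j in range(1, n)):
            total += 1
    return total

for n in (2, 3, 4):
    print(n, count_opposite(n), 2 ** (n * (n - 1) // 2))
```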


5. The representation ℜ. Two final geometric results are necessary for
the proof of the main theorem.


LEMMA 7. Let m be the index of the normalizer of a p-Sylow subgroup of
𝔊 = 𝔏_n. Then


(i) m is the number of c.s.;


(ii) m = ∏_{j=1}^{n} (q^j - 1) / (q - 1)^n.


Proof. The first statement follows from Lemma 6 and the fact that the
group 𝔊 is transitive on the c.s. For the second statement, see (13).
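Part (ii) of Lemma 7 is easy to test for small cases. The sketch below enumerates all composition sequences of F_2^n directly and compares the count with the product formula; it is restricted to q = 2 for simplicity, and the helper names are illustrative.

```python
def extend(S, v):
    """Span of the subspace S (a frozenset of bit-vectors) and the vector v."""
    return frozenset(S | {s ^ v for s in S})

def count_cs(n):
    """Number of composition sequences S^1 < S^2 < ... < S^n of F_2^n."""
    def rec(S):
        if len(S) == 2 ** n:
            return 1
        bigger = {extend(S, v) for v in range(1, 2 ** n) if v not in S}
        return sum(rec(T) for T in bigger)
    return rec(frozenset({0}))

def m_formula(n, q):
    """The value of m in Lemma 7 (ii)."""
    num = 1
    for j in range(1, n + 1):
        num *= q ** j - 1
    return num // (q - 1) ** n

for n in (2, 3, 4):
    print(n, count_cs(n), m_formula(n, 2))
```

For n = 3 and q = 2 both numbers are 21, the number of point-line flags of the projective plane of order 2.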


LEMMA 8. Let A_0 = [S_1, S_2, ..., S_n] be a simplex. Let V_σ be the face of A_0
corresponding to the permutation σ. For each σ, let {A_σj} (j = 1, 2, ...) be the set
of simplexes which have V_σ as a positive face. Let m be the integer defined by
Lemma 7. Then


(6) ∑_{σ,j} ε(σ) A_σj = m A_0.


Proof. An arbitrary simplex A makes one appearance on the left side of
(6) for each face that A has in common with A_0, the signature being positive
or negative according as this face does or does not have the same orientation
on A and A_0. This face determines a unique opposite face V_0 of A. If we keep
V_0 fixed and sum over those terms of (6) which give rise to V_0 in this way,
then, by the corollary to Theorem 1, we get A_0. We then sum over V_0 to get
the stated result.

In the case that 𝔊 = 𝔏_n, we now state our main result:


THEOREM 4. Let 𝔊 = 𝔏_n and let ℜ be the representation induced^3 in the
space Σ by 𝔊. Further suppose that 𝔓 is a p-Sylow subgroup in 𝔊, that 𝔅 is the
basis of Σ defined by Theorem 1, and that d and m are the order of 𝔓 and the index
of the normalizer of 𝔓, respectively. (These numbers are given by Lemma 5 and
Lemma 7.) Then

(i) in the sense of Theorem 3, ℜ restricted to 𝔓 is equivalent to the right regular
representation of 𝔓; the degree of ℜ is thus d;

(ii) relative to 𝔅, ℜ is represented by a set of matrices each of which has only
entries of 0, 1 or -1; in each row, at most n! non-zero entries occur, and their sum
is 0, if the row has more than one such entry;

(iii) if the base field F of the space Σ has characteristic prime to m, then ℜ is
irreducible; in particular, this is so if the characteristic is 0 or p.


Proof. Statements (i) and (ii) follow from Theorem 3 and the corollary to
Theorem 1. To prove (iii), we show that the enveloping algebra of ℜ consists
of all linear transformations from Σ to Σ. First, choose a basis X of V such
that (5) holds, and set S_j = {x_j} (j = 1, 2, ..., n). We now note that,
corresponding to each permutation σ of the numbers 1, 2, ..., n, there
exists an element Q_σ of 𝔏_n such that S_jQ_σ = S_{σ(j)}. If σ is even, Q_σ may be
defined by x_jQ_σ = x_{σ(j)}; if σ is odd, Q_σ may be defined by x_1Q_σ = -x_{σ(1)},
x_jQ_σ = x_{σ(j)}, j ≠ 1. If we now let A_0 be the simplex [S_1, S_2, ..., S_n], it follows,
by Theorem 3 and Lemma 8, that, for each A_j in 𝔅,

(7) A_jQ = m A_0,


where Q = ∑ ε(σ)P_kQ_σ, the summation being over all permutations σ and
all elements P_k of 𝔓. Now, let V' be the face of A_0 opposite to V_0 and let
𝔅' = {A_j'} (with A_0' = (-1)^{n(n-1)/2}A_0) be the corresponding basis of Σ, as given
by Theorem 1. By Lemma 1, the only member of 𝔅' which has V_0 as a
face is A_0'. By the corollary to Theorem 1 and by (7),


^3 Since the elements of 𝔊 leave (A, V) invariant, they induce well-defined linear trans-
formations in Σ and Σ*. A similar remark applies in the case of Theorem 4'.













(8) A_0'Q = m A_0', A_i'Q = 0, i ≠ 0.
By Theorem 3, applied to the basis 𝔅', there exists, for each i, an element
P_i' of 𝔊 such that

(9) A_0'P_i' = A_i'.

Now, for each pair i, j, we set


T_ij = (1/m) P_i'^{-1} Q P_j'.
By (8) and (9), it follows that A_i'T_ij = A_j'; A_k'T_ij = 0, k ≠ i. Since the T_ij
form a basis for the linear transformations from Σ to Σ, the proof of irreduci-
bility is complete.


6. The symplectic, unitary and orthogonal groups. In this section, we
consider the modifications necessary in the preceding development if the
group 𝔏_n is replaced by the other classical linear groups.

In the case of the unitary group, V denotes a vector space over a field of q^2
elements and of characteristic p; in the other cases, the field is to have q
elements.

The symplectic group, 𝔖_{2n}, has an invariant, skew, bilinear form of pairs
of vectors, x = (α_j), y = (β_j), which may be taken as


(*) (x, y) = ∑_{j=1}^{n} (α_jβ_{n+j} - α_{n+j}β_j).
For the unitary group, 𝔘_{2n}, this is to be replaced by
(*) (x, y) = ∑_{j=1}^{n} (α_jβ̄_{n+j} + α_{n+j}β̄_j)


with β̄ = β^q; for 𝔘_{2n+1} a term α_{2n+1}β̄_{2n+1} is added.
For the first orthogonal group, 𝔒_{2n}, we choose the quadratic form


(*) Q(x) = ∑_{j=1}^{n} α_jα_{n+j};


a term α_{2n+1}^2 is to be added in the case of 𝔒_{2n+1}; an irreducible quadratic form
in α_{2n+1} and α_{2n+2} is to be added for 𝔒'_{2n+2}, the second orthogonal group.
In these three cases, we introduce the inner product


(x, y) = Q(x + y) - Q(x) - Q(y).
Thus, in all cases, the concept of orthogonality of pairs of vectors exists.
Unless the contrary is stated, it is assumed in what follows that 𝔊 is any one
of the groups 𝔖_{2n}, 𝔘_{2n}, 𝔘_{2n+1}, 𝔒_{2n}, 𝔒_{2n+1} or 𝔒'_{2n+2} with n ≥ 1, the group
𝔊 = 𝔒_{2n+1} being excluded if q is even, since then 𝔊 is isomorphic to 𝔖_{2n}.
The symbol c(S^r) denotes the subspace orthogonal to S^r.


Definition 4. If V underlies an orthogonal group, and if q is even, a subspace
is isotropic if each of its vectors annuls the quadratic form Q. In all other
cases, a subspace is isotropic if every two of its vectors are orthogonal.




Definition 5. A special 2r-simplex is an ordered set of 2r isotropic 1-spaces
[S_1, S_2, ..., S_{2r}] for which there exist vectors s_j in S_j such that


(s_j, s_k) = 0, (s_{r+j}, s_{r+k}) = 0, (s_j, s_{r+k}) = δ_jk (j, k = 1, 2, ..., r).


It is clear that the vertices of such a simplex are linearly independent. The
vertices S_j and S_{r+j} are termed opposite. We shorten "special 2n-simplex" to
"simplex".

The existence of isotropic n-spaces and of simplexes follows at once from the
equations (*). In each case, the first n basis vectors span an isotropic n-space
and the 2n 1-spaces generated by the first 2n basis vectors are the vertices of
a simplex.


Definition 6. A special composition sequence (s.c.s.) is a sequence of n
isotropic subspaces [S^1, S^2, ..., S^n] such that S^j ⊂ S^{j+1} (j = 1, 2, ..., n - 1).


Definition 7. An admissible permutation (a.p.) of the numbers 1, 2, ..., 2n
is a permutation σ such that σ(n + j) ≡ n + σ(j) (mod 2n) (j = 1, 2, ..., n).


It is to be noted that an a.p. is determined by its effect on 1, 2, ..., n. The
a.p. form a group of order 2^n n! isomorphic to the hyper-octahedral group
(15). Each a.p. σ induces a permutation σ̄ of the n pairs (j, n + j). We set
ε'(σ) = ε(σ) ε(σ̄).
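The count 2^n n! of admissible permutations can be confirmed by enumeration for small n. The sketch below is illustrative only: it tests the congruence of Definition 7 on every permutation of 1, 2, ..., 2n and also checks that the admissible ones are closed under composition.

```python
from itertools import permutations
from math import factorial

def is_admissible(sigma, n):
    """sigma maps 1..2n to 1..2n; test sigma(n + j) = n + sigma(j) mod 2n."""
    return all((sigma[n + j] - n - sigma[j]) % (2 * n) == 0
               for j in range(1, n + 1))

def admissible_permutations(n):
    found = []
    for images in permutations(range(1, 2 * n + 1)):
        sigma = {i + 1: images[i] for i in range(2 * n)}
        if is_admissible(sigma, n):
            found.append(sigma)
    return found

def compose(s, t):
    """First apply s, then t."""
    return {i: t[s[i]] for i in s}

for n in (1, 2, 3):
    aps = admissible_permutations(n)
    closed = all(is_admissible(compose(s, t), n) for s in aps for t in aps)
    print(n, len(aps), 2 ** n * factorial(n), closed)
```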


Definition 8. Let A = [S_1, S_2, ..., S_{2n}] be a simplex and V = [S^1, S^2, ..., S^n]
an s.c.s. Suppose that there exists an a.p. σ such that
S^j = {S_{σ(1)}, S_{σ(2)}, ..., S_{σ(j)}} (j = 1, 2, ..., n).


Then V is termed a face of A: a positively or negatively oriented face according
as ε'(σ) is 1 or -1. The face V_1 of A which is opposite to V is defined by


V_1 = [S_1^1, S_1^2, ..., S_1^n], S_1^j = {S_{σ(n+1)}, S_{σ(n+2)}, ..., S_{σ(n+j)}} (j = 1, 2, ..., n).
The spaces Σ and Σ* are defined as in §3.


Lemmas 1, 2, 3 and 4 have analogues which are:


LEMMA 1'. If V = [S^1, S^2, ..., S^n] and V_1 = [S_1^1, S_1^2, ..., S_1^n] are two
faces of a simplex A = [S_1, S_2, ..., S_{2n}], then a necessary and sufficient condition
for V and V_1 to be opposite is that
(3') S^j ∩ c(S_1^j) = 0 (j = 1, 2, ..., n).


If V and V_1 are two s.c.s. for which (3') holds, there exists a simplex A,
uniquely determined to within an ordering of its vertices, which has V and V_1
as (opposite) faces.


LEMMA 2'. If [S_1, S_2, ..., S_{2n}] is a simplex and σ is an a.p., then


[S_{σ(1)}, S_{σ(2)}, ..., S_{σ(2n)}] = ε'(σ) [S_1, S_2, ..., S_{2n}].













LEMMA 3'. Let [S_1, S_2] be a special 2-simplex contained in a 2-space S^2.
Let {A} be a collection of special (2n - 2)-simplexes contained in c(S^2). For
each A, let A' be the special 2n-simplex which has S_1 and S_2 as its first and (n + 1)st
vertices and the vertices of A, taken in positive order, as its remaining vertices.
Then ∑ A = 0 implies ∑ A' = 0.


LEMMA 4'. Let A be a simplex and S an isotropic 1-space. Then A can be
expressed as a sum of simplexes each of which has S as a vertex.


Proof. The proofs of Lemmas 1', 2' and 3' are virtually the same as those of
Lemmas 1, 2 and 3, and so may be omitted. As a first step in the proof of
Lemma 4', we check two special cases. If n = 1, A = [S_1, S_2], S ≠ S_1, S ≠ S_2,
then it is easy to verify that [S_1, S] and [S, S_2] are simplexes (see Definition 5),
and that

[S_1, S_2] = [S_1, S] + [S, S_2].


Next suppose that n = 2, A = [S_1, S_2, S_3, S_4], and that S is orthogonal to
exactly one vertex of A, say to S_1. Then, if


T = c(S) ∩ {S_2, S_3}, U = c(S) ∩ {S_3, S_4},
the required conclusion may be drawn from the equation
[S_1, S_2, S_3, S_4] = [S, T, S_3, U] + [S_1, S_2, T, S] + [S_1, S, U, S_4].


The rest of the proof consists in showing that any other case can be reduced 
to one of these two cases. We may suppose that m > 2 and that S is not 
orthogonal to a pair of opposite vertices, say S, and S,,;, since, then, the 
restriction to c({S,, S,4:}) and an application of Lemma 3’ effectively replaces 
n by n — 1. Thus we may suppose that the two vertices S,,, and S,,2 are not 
orthogonal to S. Now set 


$$T = c(S) \cap \{S_{n+1}, S_{n+2}\}, \qquad U = c(T) \cap \{S_1, S_2\}.$$
Then the following is a relation among special 4-simplexes, all in one 4-space:
(10) $[S_1, S_2, S_{n+1}, S_{n+2}] = [S_1, U, T, S_{n+2}] + [U, S_2, S_{n+1}, T]$.


By Lemma 3', if the vertices $S_3, \ldots, S_n, S_{n+3}, \ldots, S_{2n}$ are adjoined to these
4-simplexes, an expression is obtained for $A$ as a sum of two simplexes each of
which has at least one vertex orthogonal to $S$. If $n > 3$, this construction can
be repeated, with the indices 1 and 2 replaced by 2 and 3, to yield a second
vertex orthogonal to $S$. Finally, if $n > 2$ and $S$ is orthogonal to two vertices
of $A$ (which may be taken as non-opposite), say to $S_1$ and $S_2$, and not orthogonal
to $S_{n+1}$ and $S_{n+2}$, then the same construction yields, on the right side of
(10), two simplexes, each of which has a pair of opposite vertices orthogonal to
$S$; and this case has already been considered.

In the statement of Theorem 1, the number $(-1)^{n-1}$ is to be replaced by
$(-1)^n$ in the present case; in the corollary to Theorem 1, $n!$ by $2^n n!$. No
changes are required in the proof.







The analogue of Lemma 5 is: 


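LEMMA 5'. The order of a p-Sylow subgroup of $\mathfrak{G}$ is

$$d = q^{n^2},\quad q^{n(2n-1)},\quad q^{n(2n+1)},\quad q^{n(n-1)},\quad q^{n^2} \quad\text{or}\quad q^{n(n+1)}$$

according as

$$\mathfrak{G} = S_{2n},\ U_{2n},\ U_{2n+1},\ O_{2n},\ O_{2n+1} \ \text{or}\ O'_{2n+2}.$$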
Proof. See (2;3) for the order of G. 


The statement of Lemma 6 goes over intact and the proof is similar; so 
both may be omitted. The same remark applies to Theorems 2 and 3. 

We now note an exception that occurs (only) in the case that $\mathfrak{G} = O_{2n}$.
Then the isotropic $n$-spaces form two families such that two members of the
same family (of opposite families) intersect in a space of dimension $n - r$
with $r$ even (odd), and such that the elements of $O_{2n}$ permute these $n$-spaces
within their separate families (3, p. 48). The first property implies that at
most one-half of the $2^n$ isotropic $n$-spaces spanned by sets of $n$ vertices of a
given simplex can fail to intersect a given isotropic $n$-space; thus, in the
corollary to Theorem 1, the number $n!$ may be replaced by $2^{n-1} n!$, in this case.
The second property implies that the group $O_{2n}$ is not transitive on all of the
s.c.s., only on one-half of them. Thus the analogue of Lemma 7 takes the
following form:


LEMMA 7'. Let $m$ be the index of the normalizer of a p-Sylow subgroup of $\mathfrak{G}$.
Then

(i) if $\mathfrak{G} = O_{2n}$, $m$ is one-half the number of s.c.s.; if $\mathfrak{G} = S_{2n}$, $U_{2n}$, $U_{2n+1}$,
$O_{2n+1}$ or $O'_{2n+2}$, $m$ is the number of s.c.s.;

(ii) if $\mathfrak{G} = S_{2n}$, $U_{2n}$, $U_{2n+1}$, $O_{2n}$, $O_{2n+1}$ or $O'_{2n+2}$, then

-( (q*! - »)/«@ - v(t (q’ - (-1) / (g — 1)", 

(TT 1— (-1)/ @- , 

@ -0( Fh @ »)/ @-17, 

eo or @ + (TL « -1)/@-". 


Proof. Part (ii) is easily established by counting the number of s.c.s. using
induction on $n$. If $S$ is an isotropic 1-space, one may invoke the induction
hypothesis on the quotient space $c(S)/S$ with the induced definition of isotropy.
We omit the details.

In the modified statement of Lemma 8, only admissible permutations are
to be considered; if $\mathfrak{G} = O_{2n}$, a further restriction is to be made to even
permutations. No essential change occurs in the proof. The analogue of
Theorem 4 may be stated as follows:


THEOREM 4'. Let $\mathfrak{G}$ be one of the groups $S_{2n}$, $U_{2n}$, $U_{2n+1}$, $O_{2n}$, $O_{2n+1}$ or $O'_{2n+2}$.
Let $\mathfrak{R}$ be the representation induced* in the space $\Sigma$ by $\mathfrak{G}$. Further suppose that
$\mathfrak{P}$ is a p-Sylow subgroup in $\mathfrak{G}$, that $B$ is the basis of $\Sigma$ defined by Theorem 1',
and that $d$ and $m$ are the order of $\mathfrak{P}$ and the index of the normalizer of $\mathfrak{P}$, respectively.
(These numbers are given by Lemma 5' and Lemma 7'.) Then all conclusions
of Theorem 4 are valid if, in (ii), the number $n!$ is replaced by $2^{n-1} n!$ if
$\mathfrak{G} = O_{2n}$ and by $2^n n!$ in all other cases.


The proof of Theorem 4 carries over without essential change. 
Theorems 4 and 4’ imply the theorem stated in the introduction. 


7. Concluding remarks. Our first remarks take the form of two con- 
jectures which, if true, provide converses to the theorem of the introduction: 


CONJECTURE 1. The group $\mathfrak{G}$ does not have an irreducible representation of
degree $d$ over a field whose characteristic divides $m$.


We are able to prove the following weaker result: 


THEOREM 5. Using the notations of Theorems 4 and 4', if the characteristic
of $F$ divides $m$, then the representation $\mathfrak{R}$ is reducible.


Proof. It is convenient to introduce "boundary" operators $b$ and $b^*$ on $\Sigma$
and $\Sigma^*$: for each simplex $A$, let $bA$ be the signed sum of the faces of $A$, the
sign of a face being that of its orientation on $A$; for each c.s. $V$, let $b^*V$ be the
sum of those simplexes which have $V$ as a positive face; then extend $b$ and $b^*$
to all of $\Sigma$ and $\Sigma^*$ by linearity. Lemmas 8 and 8' may now be rewritten as:
$b^* b A_0 = m A_0$. Thus, if $A$ is a simplex and $V$ a c.s., it follows that $(b^* b A, V) =
m(A, V)$, and this is easily seen to be equivalent to $(b^* V, bA) = m(A, V)$.
The assumption $m = 0$ then implies that $b^*\Sigma^*$ and $b\Sigma$ are orthogonal. It is
easy to see that neither of them is 0. Hence $b^*\Sigma^*$ is a proper non-zero subspace
of $\Sigma$. This subspace is invariant under $\mathfrak{R}$: if $G$ is an element of $\mathfrak{G}$ and $V$ is a
c.s., then $(b^*V)G = b^*(VG)$. Thus $\mathfrak{R}$ is reducible.


CONJECTURE 2. The notation being that of Theorems 4 and 4', any proper
subgroup $\mathfrak{H}$ of $\mathfrak{G}$ does not have an irreducible representation of degree $d$. In
particular, the restriction of $\mathfrak{R}$ to $\mathfrak{H}$ is reducible.


If $\mathfrak{G} = L_2$, $L_3$, $U_3$ or $S_4$, this statement follows from results of Moore (11),
Wiman (14), Hartley (6) and Mitchell (9; 10), who have shown that, in these
cases, every proper subgroup $\mathfrak{H}$ of $\mathfrak{G}$ has order less than $d^2$.

In (13), an alternative method is used to derive the character of $\mathfrak{R}$ in the
case that $\mathfrak{G} = L_n$. There, use is made of a correspondence between $L_n$ and the
symmetric group of degree $n$. If $\mathfrak{G}$ is one of the other groups considered in this
paper, a similar correspondence exists between $\mathfrak{G}$ and the hyper-octahedral
group of the appropriate degree, and yields the character of $\mathfrak{R}$. However, this
method leans heavily on a previous determination of the characters of the
symmetric and hyper-octahedral groups and does not deal with the representation
itself.

Our final observation is that the special case m = 3 of the corollary to 
Theorem 1 also follows from a theorem on graphs (8, p. 126). 


REFERENCES 


1. H. W. Brinckmann, The group characteristics of the ternary linear fractional group and of 
various other groups, Bull. Amer. Math. Soc., 27 (1921), 152. 
2. L. E. Dickson, Linear groups in Galois fields (Leipzig, 1901). 
3. J. Dieudonné, La géométrie des groupes classiques, Ergeb. Math. (Berlin, 1955). 
4. J. S. Frame, Some irreducible representations of hyperorthogonal groups, Duke Math. J., 
1 (1935), 442-448. 
5. J. A. Green, The characters of the finite linear groups, Trans. Amer. Math. Soc., 80 (1955), 
402-447. 
6. R. W. Hartley, Determination of the ternary collineation groups whose coefficients lie in the
GF($2^n$), Ann. of Math., ser. 2, 27 (1925-26), 140-158.
7. H. Jordan, Group-characters of various types of linear groups, Amer. J. of Math., 29 (1907), 
387-405. 
8. D. König, Theorie der endlichen und unendlichen Graphen (Chelsea, New York, 1950).
9. H. H. Mitchell, Determination of the ordinary and modular ternary linear groups, Trans.
Amer. Math. Soc., 12 (1911), 207-242.
10. H. H. Mitchell, The subgroups of the quaternary abelian linear group, Trans. Amer. Math. Soc., 15
(1914), 379-396.
11. E. H. Moore, The subgroups of the generalized finite modular group, Dec. Publ. Univ. of 
Chicago, 9 (1904), 141-190. 
12. I. Schur, Untersuchungen über die Darstellung der endlichen Gruppen durch gebrochene
lineare Substitutionen, J. für Math., 132 (1907), 85-137.
13. R. Steinberg, A geometric approach to the representations of the full linear group over a Galois 
field, Trans. Amer. Math. Soc., 71 (1951), 274-282. 
14. A. Wiman, Bestimmung aller Untergruppen einer doppelt unendlichen Reihe von einfachen 
Gruppen, Handl. Svenska Vet.-Akad., 25 (1899), 1-47. 
15. A. Young, On quantitative substitutional analysis (fifth paper), Proc. London Math. Soc., 
ser. 2, 31 (1930), 273-288. 





Institute for Advanced Study 








ON INDEFINITE TERNARY QUADRATIC FORMS 
B. W. JONES AND G. L. WATSON


1. Introduction. The first systematic study of equivalence of indefinite 
ternary quadratic forms seems to be that of A. Meyer (10) (see also Bachmann 
(1)). By methods which are often obscure he showed that the number of 
classes in a genus is a power of 2, the exact power depending on certain quad- 
ratic characters associated with the form. These investigations, however, 
dealt only with forms of odd determinant, in the classical sense (in our notation,
$A \equiv 0 \pmod 2$ and $d \equiv 4 \pmod 8$). Donald Marsh (9) established an
algorithm by which the number of classes may be determined. Eichler (4) 
has, as a consequence of deep and general theory, thrown much light on these 
questions. 

Here, using concepts closely related to the spinor genera of Eichler, we
define a multiplicative group $\Gamma_d$ of square-free integers prime to $d$, the determinant
of $f$. Further we show that $\Gamma_d$ has a subgroup $\gamma(f)$ consisting of all
those elements of $\Gamma_d$ which are denominators of rational automorphs of $f$,
where by the denominator of a matrix we mean the l.c.m. of the denominators
of its elements. We show that the number of classes in the genus of $f$ is equal
to the order of the factor group $\Gamma_d/\gamma(f)$. In the process of deriving this result
we get information about the automorphs which yields much new information
(see Theorem 2) about the representation of numbers by indefinite ternary
quadratic forms. An alternative definition of a group $\Gamma(p, f)$ is given in §3
by means of which the order of the factor group above can be determined.


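2. Notation. For certain matrices with integral elements we shall use
the notation:

$$x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}, \qquad x' = (x_1, x_2, x_3), \qquad \hat{x} = \begin{pmatrix} 0 & -x_3 & x_2 \\ x_3 & 0 & -x_1 \\ -x_2 & x_1 & 0 \end{pmatrix}.$$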


Latin capitals will denote 3 X 3 non-singular matrices, with rational elements, 
I being the identity matrix. Other letters will denote integers unless otherwise 
stated, p being always a prime. 

For a ternary form, we use the notation

$$f = f(\xi) = f(\xi_1, \xi_2, \xi_3) = \tfrac{1}{2}\,\xi' A \xi, \qquad A = (a_{ij}) = (\partial^2 f/\partial \xi_i\, \partial \xi_j),$$
and define the invariant (2, pp. 4, 5)

$$d = d(f) = -\tfrac{1}{2}|A|.$$


Received September 19, 1955. 
















We assume that f has integral coefficients and is indefinite and non-degenerate 
(i.e. that $d \ne 0$). We also assume that f is primitive, that is, that its coefficients
have greatest common divisor 1. The non-primitive case can easily be deduced 
from it. We note that A, being symmetrical and having even diagonal elements, 
is congruent (mod 2) to a skew matrix, which is singular (mod 2); hence 
d is integral. 
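
As a small illustration of these definitions, the following sketch builds the matrix $A$ from the six coefficients of $f$ and computes $d = -\tfrac{1}{2}|A|$. The function names (`hessian`, `det3`, `invariant_d`) and the coefficient ordering are our own assumptions for the example, not notation from the paper.

```python
def hessian(c11, c22, c33, c12, c13, c23):
    """Matrix A = (d^2 f / d xi_i d xi_j) of the ternary form
    f = c11*x1^2 + c22*x2^2 + c33*x3^2 + c12*x1*x2 + c13*x1*x3 + c23*x2*x3."""
    return [[2 * c11, c12, c13],
            [c12, 2 * c22, c23],
            [c13, c23, 2 * c33]]


def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def invariant_d(A):
    """Return d = -|A|/2; |A| is even because A has even diagonal entries."""
    det = det3(A)
    assert det % 2 == 0, "|A| should be even for an integral form"
    return -det // 2
```

For instance, the form $\xi_1\xi_2 + 5\xi_3^2$ of the shape (2.3) gives $d = 5$, and $\xi_1^2 + \xi_2^2 - \xi_3^2$ gives $d = 4$.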

As we are concerned with properties invariant under integral unimodular 
transformations, we may suppose when necessary that the form f is a repre- 
sentative of its class satisfying one or other of the following congruences: 


(2.1) $f \equiv \sum_{i=1}^{3} p^{\lambda_i} a_i \xi_i^2 \pmod{p^\beta}$, $p \nmid a_1 a_2 a_3$, $0 = \lambda_1 \le \lambda_2 \le \lambda_3$;
(2.2) f = 2" o(E:, &) + 2ak} (mod 2°), A, As = 0, A0rA, 0 
(2.3) $f \equiv \xi_1 \xi_2 + d\,\xi_3^2 \pmod{p^\beta}$, $p \nmid d$,


where in (2.2) $a$ and the discriminant of $\phi$ are odd. This is possible for any
prescribed $\beta$; we assume always that $\beta$ exceeds the highest exponent on the
right side of the congruence by at least $2 + (-1)^p$. For a proof that every $f$
is equivalent to a form satisfying (2.1) if $p > 2$, or one of (2.1), (2.2) if $p = 2$,
see, for example, Jones (6, pp. 84, 85). Starting with either of these it is easy
to obtain (2.3).

The exponents $\lambda_i$, $\lambda$ are unique for a given $f$ and are invariants of the genus,
as are the possible values of the quadratic characters of the $a_i$ and $a$ modulo
$p$ or 8. But the latter are not always unique.


3. Definition of certain groups and statement of results. We consider
the set of integers $b \ne 0$ for which, for any prescribed $p$ and $f$, we can find $t$
so that
(3.1) $t'A \equiv 0 \pmod{p^\delta}$, $f(t) \equiv b \pmod{p^r}$, $p \nmid t$,
with $\delta \ge 0$ such that $p^\delta \| b$, and $r = \delta + 2 + (-1)^p$. Note that if (3.1) is
soluble, then $f$ has the automorphism $\xi \to -\xi + t\,t'A\xi/f(t)$, reducing in case
$t' = (1, 0, 0)$ to

$$\xi_1, \xi_2, \xi_3 \to \xi_1 + 2a_{11}^{-1}(a_{12}\xi_2 + a_{13}\xi_3),\ -\xi_2,\ -\xi_3.$$

The denominator of this automorph is a divisor of $p^{-\delta} f(t)$, which is prime to
$p$ and congruent to $p^{-\delta} b$ modulo $p$ or 8.

Now we observe that the set of positive and negative square-free integers
$v$ forms a group, $\Gamma$, with the operation

(3.2) $v_1 \circ v_2 = v_1 v_2 (v_1, v_2)^{-2}$.

Any subset of $\Gamma$ closed under this operation is a group. For any $b$ for which
(3.1) can be satisfied, write
(3.3) $-db = u^2 v$, $v \in \Gamma$.

Note that $v$ is not altered by multiplying $f$ by any integer. We define $\Gamma(p, f)$
to be the sub-group of $\Gamma$ generated by all $v$ arising from (3.3).
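
The group operation on square-free integers and the reduction in (3.3) can be sketched as follows. This is only an illustration under our reading of (3.2) as $v_1 \circ v_2 = v_1 v_2/(v_1, v_2)^2$; the names `compose` and `squarefree_part` are hypothetical helpers, not the authors' notation.

```python
from math import gcd


def compose(v1, v2):
    """Product of two square-free integers in the group of (3.2):
    v1 * v2 divided by the square of their g.c.d. (the division is exact)."""
    g = gcd(v1, v2)
    return v1 * v2 // (g * g)


def squarefree_part(n):
    """Return the square-free v (sign included) with n = u^2 * v, for n != 0.
    Applied to n = -d*b this gives the v of (3.3)."""
    if n == 0:
        raise ValueError("n must be non-zero")
    sign = -1 if n < 0 else 1
    n = abs(n)
    v = 1
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if exponent % 2 == 1:
            v *= p
        p += 1
    # Whatever is left of n is 1 or a single prime occurring to the first power.
    return sign * v * n
```

Every element is its own inverse under `compose`, which is why a closed subset is automatically a subgroup; for example `compose(6, 10)` is 15 and `compose(-6, -6)` is 1.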










We denote by $\Gamma_m$ the subgroup of $\Gamma$ defined by $(v, m) = 1$; and we use the
groups $\Gamma(p, f)$, $p \mid d$, to define a certain sub-group of $\Gamma_d$.

We define $\gamma(f)$ as follows: $q \in \gamma(f)$ if and only if $q \in \Gamma_d$ and there exists a
$w$ such that $w \mid d$ and

$$wq \in \bigcap_{p \mid d} \Gamma(p, f).$$

Note that with $q$, also $-q$ belongs to $\gamma(f)$; also that $\gamma(f)$ is a group since
$(w_1 q_1) \circ (w_2 q_2) = (w_1 \circ w_2)(q_1 \circ q_2)$.

By the denominator of a rational matrix we mean the least common denominator
of its elements, with either sign. We shall show that $f$ may be taken
into a form in the same genus by a transformation whose matrix has any
prescribed denominator $q$ in $\Gamma_d$. On the other hand, $f$ has an automorph with
denominator $q$ in $\Gamma_d$ if and only if $q \in \gamma(f)$. We shall thus prove


THEOREM 1. The number of classes in the genus of f is equal to the order of the
factor group $\Gamma_d/\gamma(f)$.


Alternatively, if $\nu \ge 0$ is the number of distinct characters in the set
(3.4) $(\pm 2|q)$ if $2 \mid d$; $(q|p)$ if $p > 2$, $p \mid d$

(where the symbols are Jacobi symbols), and if these are capable of $2^\delta$ distinct
sets of values (each $\pm 1$) for $q$ in $\gamma(f)$, then the class-number is $2^{\nu-\delta}$.

Note that $\Gamma(p, f)$ and $\gamma(f)$ are invariants of the genus of $f$, since forms in
the same genus have the same congruence properties. The same remark
therefore applies to the invariant $d_0 = d_0(f)$ which we now define: $d_0$ is the
product of all the distinct odd primes $p$ for which $\Gamma_p \not\subset \Gamma(p, f)$, multiplied by
2 in case $5 \notin \Gamma(2, f)$.

We leave it to the reader to verify that the alternative statement of Theorem
1 remains valid on putting $d_0$ for $d$ in (3.4). We shall see that $\Gamma(p, f) = \Gamma_p$
when $p \nmid d$, so that $d_0$ divides $d$. We can now state


THEOREM 2. Suppose $n$ is represented by at least one but not by all of the
classes of forms in the genus of $f$, and write $dn = n_1 n_2^2$, $n_1$ square-free. Then
(i) $n_1 > 1$;
(ii) $n_1$ divides $d_0$;
(iii) $n_1 \equiv 1 \pmod 8$ if $d$ is odd;
(iv) if $(p, 2d) = 1$ and $(n_1|p) = -1$, $p$ cannot divide $n$;
(v) the number of classes in the genus that represent $n$ is equal to the number
that do not.


We conclude this section with an alternative definition of $\Gamma(p, f)$, in which
$f$ is assumed to satisfy (2.1) or (2.2) as the case may be and by means of which
$\gamma(f)$ could be computed. Below $\sigma_{ij}$ denotes 1 or 0, according as $\lambda_i + \lambda_j$ is
odd or even.






















ALTERNATIVE DEFINITION OF $\Gamma(p, f)$


(a) $p > 2$. $\Gamma(p, f)$ is the sub-group of $\Gamma$ generated by
(i) the group of $v$ with $(v|p) = 1$;
(ii) the set of $v \equiv p^{\sigma_{ij}} a_i a_j \pmod{p^{1+\sigma_{ij}}}$ for any $i, j$;
(iii) the group $\Gamma_p$, if two of the exponents $\lambda_1, \lambda_2, \lambda_3$ are equal.
(b) $p = 2$ and (2.2) holds. $\Gamma(2, f) = \Gamma$ or $\Gamma_2$, according as $\lambda$ is odd or even.
(c) $p = 2$ and (2.1) holds. $\Gamma(2, f)$ is generated by the set of $v \equiv 1 \pmod 8$
or $2^{\sigma_{ij}} a_i a_j \pmod{2^{3+\sigma_{ij}}}$, together with the following integers, reduced mod 8,
if the stated conditions hold for any unequal $i, j$:
(i) $1 + a_i a_j$, if $\lambda_i = \lambda_j$ and $a_i \equiv a_j \pmod 4$;
(ii) 5, if $\lambda_i - \lambda_j = 0$, 2 or 4;
(iii) 3, if $\lambda_3 \le 2$;
(iv) $1 + 2a_i a_j$, if $\lambda_i - \lambda_j = 1$ or 3.
The equivalence of this to the earlier definition will be proved in §5.


4. Rational automorphs. We make use of the rational automorphs 
S, Si, S2, . . . of the form f, or of its matrix A. We shall consider only automorphs 
with determinant 1; we lose nothing thereby, since f has always the trivial 
automorph $-I$ with determinant $-1$. By the denominator of S we mean the
least common denominator of its elements with either sign. 

We are thus concerned with matrices S satisfying 


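(4.1) $S'AS = A$, $|S| = 1$.
The solution of (4.1) was found by Hermite (5) and may be written (with
$u \ne 0$, $v$ square-free, $\bar{A} = \operatorname{adj} A$):

(4.2) $x_0^2 - d f(x) = u^2 v$,

(4.3) $u^2 v\,(I + S) = 2x_0^2 I + x_0 \bar{A}\hat{x} - d\,x x'A$.
We shall need the following results: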
We shall need the following results: 


LEMMA 1. (i) Whenever $x_0$, $x$ have integral values such that the left member of
(4.2) does not vanish, $S = S(x_0, x)$ defined by (4.2), (4.3) satisfies (4.1).

(ii) Conversely, if $S$ satisfies (4.1) there exist integral $x_0, x, u, v$ satisfying
(4.2), (4.3); the integer $v = v(A, S)$ is uniquely determined by $A$, $S$ as are the
ratios of $x_0, x_1, x_2, x_3$.

(iii) $I + S$ is singular if and only if $x_0 = 0$.

(iv) We have with the notation of (3.2)

(4.4) $S(x_0, -x) = S^{-1}(x_0, x)$,

(4.5) $v(A, S_1 S_2) = v(A, S_1) \circ v(A, S_2)$.
(v) If the transformation $T$ is non-singular, then

(4.6) $v(T'AT, T^{-1}ST) = v(A, S)$.
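
Part (i) of the lemma can be checked numerically. The sketch below assumes our reading of Hermite's formulas, namely $u^2 v = x_0^2 - d f(x)$ and $u^2 v (I + S) = 2x_0^2 I + x_0 \bar{A}\hat{x} - d\,xx'A$, with $\bar{A}$ the adjugate of $A$ and $\hat{x}$ the skew matrix of §2. All function names are ours, and exact rational arithmetic is used throughout.

```python
from fractions import Fraction


def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def adjugate(M):
    """Adjugate of a 3 x 3 matrix: adj[j][i] is the signed cofactor of M[i][j]."""
    adj = [[0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            i1, i2, j1, j2 = (i + 1) % 3, (i + 2) % 3, (j + 1) % 3, (j + 2) % 3
            adj[j][i] = M[i1][j1] * M[i2][j2] - M[i1][j2] * M[i2][j1]
    return adj


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(X):
    return [[X[j][i] for j in range(3)] for i in range(3)]


def hat(x):
    """Skew matrix of the Notation section, so that hat(x) applied to y is x cross y."""
    return [[0, -x[2], x[1]], [x[2], 0, -x[0]], [-x[1], x[0], 0]]


def automorph(A, x0, x):
    """Rational automorph S(x0, x) of A obtained by solving (4.3) for S."""
    d = Fraction(-det3(A), 2)
    xA = [sum(x[i] * A[i][j] for i in range(3)) for j in range(3)]  # row vector x'A
    f = Fraction(sum(xA[j] * x[j] for j in range(3)), 2)            # f(x) = x'Ax / 2
    n = x0 * x0 - d * f                                             # left member of (4.2)
    if n == 0:
        raise ValueError("the left member of (4.2) must not vanish")
    AX = matmul(adjugate(A), hat(x))
    return [[(2 * x0 * x0 * (i == j) + x0 * AX[i][j] - d * x[i] * xA[j]) / n - (i == j)
             for j in range(3)] for i in range(3)]
```

With $f = \xi_1\xi_2 + \xi_3^2$, i.e. `A = [[0, 1, 0], [1, 0, 0], [0, 0, 2]]`, the call `automorph(A, 1, [1, 2, 3])` returns a matrix with entries in tenths that satisfies both conditions of (4.1); taking `x0 = 0` gives the automorph $-I + xx'A/f(x)$ noted after (3.1).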


Formulas (4.2) and (4.3) are not new; see, for example, Bachmann (1,
pp. 81-108). However, for completeness, we here give a proof in modern 










notation based on Cayley’s theorem (3) which is easily derived (8, p. 66). 
This theorem states that if $S$ is an automorph in a field $F$ of a symmetric
matrix $A$ in $F$ such that $I + S$ is non-singular, then there exists a skew matrix
$Q$ in $F$ such that $A + Q$ is non-singular and

$$(A + Q)S = A - Q,$$


and that all such automorphs can be expressed in this form. 


In our notation, $A\bar{A} = -2dI$. If we choose $x_0 \ne 0$ and $x$ so that $d\hat{x} = x_0 Q$,
the product
$$(A + Q)(2x_0^2 I + x_0 \bar{A}\hat{x} - d\,xx'A)$$
reduces to
(4.7) $2x_0^2 A - d\,Axx'A + d\,\hat{x}\bar{A}\hat{x}$.
We shall verify below that the following identity holds:
(4.8) $\hat{x}\bar{A}\hat{x} + (x'Ax)A = Axx'A$.


Using this, (4.7) reduces to $\{2x_0^2 - 2df(x)\}A$ which, in virtue of the non-singularity
of $A + Q$, yields formulas (4.2) and (4.3). Note that $I + S$
non-singular implies $x_0 \ne 0$. This, subject to verification of (4.8), completes
the proof of sections (i) and (ii) for $I + S$ non-singular.

If $I + S$ is singular, a theorem of Stieltjes (12) states that $I + S$ is not of
rank 2, a result not hard to verify directly. The theorem of Jones and Marsh
(7) establishes our result or it may be proved as follows: $I + S$, being of rank
1, must be equal to $xy'A$ for two column vectors $x$ and $y$. Then $S'AS = A$ yields

$$Ay\,x'Ax\,y'A = Axy'A + Ayx'A.$$
Multiplying on the right by $A^{-1}\hat{y}$ and on the left by $A^{-1}$ we see that $yx'\hat{y} = 0$,
$\hat{x}y = 0$ and hence $y = \lambda x$ for some non-zero scalar $\lambda$. Then $\lambda^2(x'Ax)xx' = 2\lambda xx'$,
which implies $\lambda f(x) = 1$.

It remains to verify (4.8). This is easily done directly for $A$ a diagonal
matrix. Suppose $B$ is any symmetric matrix with rational elements. There is a
matrix $T$ of determinant 1 with rational elements such that $B = T'AT$.
Then, letting $x = Ty$, we have $\bar{A} = T\bar{B}T'$ and (4.8) becomes a similar expression
with $x$ and $A$ replaced by $y$ and $B$ by use of the following identity
(4.9) $R'\,\widehat{(Ry)}\,R = |R|\,\hat{y}$,
which may be shown as follows: since $R'\widehat{(Ry)}R$ is skew, call it $\hat{x}$ and see that
$\hat{x}y = 0$ implies that $x = g(R)y$, where $g(R)$ is a scalar dependent on $R$ but
not on $y$ or $x$. Let $I_{ij}$ and $I_{ir}$ be the matrices obtained from $I$ by interchanging
the $i$th and $j$th rows, multiplying the $i$th row by $r$, respectively. Then it is
easily shown that

$$g(I) = 1, \quad g(I_{ij}) = -1, \quad g(I_{ir}) = r, \quad g(RS) = g(R)g(S).$$


Thus, $g(R)$ is a linear homogeneous function of the elements of each row of $R$
and changes sign when two rows of $R$ are interchanged. Since $g(I) = 1$, this
implies from Weierstrass that $g(R) = |R|$.

























Assertion (iii) is now obvious. 

To obtain (4.4), note that with $(A + Q)S = A - Q$ we have $(A - Q)S^{-1} =
A + Q$. Hence we may replace $x_0, x, S, Q$ by $x_0, -x, S^{-1}, -Q$ in the foregoing
argument, in the general case. In case $I + S$ is singular, we have only to prove
$S^{-1} = S$ or $S^2 = I$. This is easily verified from (4.2), (4.3) with $x_0 = 0$.

To obtain (4.6), note that (4.2), (4.3) are unaltered (apart from multiplication
of (4.3) on the left by $T^{-1}$ and on the right by $T$) on putting $T'AT$,
$T^{-1}ST$, $T^{-1}x$, $|T|x_0$ for $A, S, x, x_0$.

It remains only to establish (4.5). This is best done by using quaternions,
following Eichler (4), Pall (11) and others. We use a generalized quaternion
algebra with multiplication defined by

(4.10) $(x_0, x)(y_0, y) = (x_0 y_0 + \tfrac{1}{2} d\,x'Ay,\ x_0 y + y_0 x - \tfrac{1}{2}\bar{A}\hat{x}y)$.

The vector $x$ may be identified with the pure quaternion $(0, x)$; and the
scalar $x_0$ with $(x_0, 0)$. The conjugate of $(x_0, x)$ is $(x_0, -x)$, and its norm is
$(x_0, -x)(x_0, x)$, which by (4.10) is $x_0^2 - d f(x)$.
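
The product (4.10) and its norm can be experimented with directly. The following sketch assumes our reading of (4.10), with scalar part $x_0 y_0 + \tfrac12 d\,x'Ay$ and vector part $x_0 y + y_0 x - \tfrac12 \bar{A}(x \times y)$, where $\hat{x}y$ is the cross product $x \times y$; the names `qmul` and `qnorm` are hypothetical.

```python
from fractions import Fraction


def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def adjugate(M):
    """Adjugate of a 3 x 3 matrix: adj[j][i] is the signed cofactor of M[i][j]."""
    adj = [[0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            i1, i2, j1, j2 = (i + 1) % 3, (i + 2) % 3, (j + 1) % 3, (j + 2) % 3
            adj[j][i] = M[i1][j1] * M[i2][j2] - M[i1][j2] * M[i2][j1]
    return adj


def cross(x, y):
    return [x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0]]


def qmul(A, p, q):
    """Product (x0, x)(y0, y) in the generalized quaternion algebra attached to A."""
    (x0, x), (y0, y) = p, q
    d = Fraction(-det3(A), 2)
    xAy = sum(x[i] * A[i][j] * y[j] for i in range(3) for j in range(3))
    Abar = adjugate(A)
    c = cross(x, y)
    vec = [x0 * y[i] + y0 * x[i] - Fraction(sum(Abar[i][k] * c[k] for k in range(3)), 2)
           for i in range(3)]
    return (x0 * y0 + d * xAy / 2, vec)


def qnorm(A, p):
    """Norm x0^2 - d f(x) of the quaternion (x0, x)."""
    x0, x = p
    d = Fraction(-det3(A), 2)
    f = Fraction(sum(x[i] * A[i][j] * x[j] for i in range(3) for j in range(3)), 2)
    return x0 * x0 - d * f
```

For $f = \xi_1\xi_2 + \xi_3^2$ the quaternions $(1, (1,2,3))$ and $(2, (0,1,-1))$ have norms $-10$ and $3$, and their product has norm $-30$, in line with the multiplicativity used to derive (4.5).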

It is easily verified that (4.10) defines an associative algebra. The verification
may be simplified by the device used in the proof of (4.8); for (4.10) is
invariant under substitution of $T'AT$, $T^{-1}x$, $T^{-1}y$, where $|T| = 1$, for $A, x, y$.
Thus we may suppose $A$ to be diagonal and then (4.10) takes a familiar form.
From the fact that multiplication is associative, it follows that the norm is
multiplicative. Thus (4.5) will follow if we show that (4.2), (4.3) are
equivalent to


(4.11) $(x_0, -x)(0, \xi)(x_0, x) = (x_0, -x)(x_0, x)(0, S\xi)$.
Noting that $\hat{\xi}x = -\hat{x}\xi$, we see that the left member of (4.11) is
$$\bigl(\tfrac12 d x_0\, \xi'Ax - \tfrac12 d x_0\, x'A\xi - \tfrac14 d\, x'A\bar{A}\hat{x}\xi,\ \ -\tfrac12 d(\xi'Ax)x + x_0^2 \xi + \tfrac12 x_0 \bar{A}\hat{x}\xi + \tfrac12 \bar{A}\hat{x}[x_0 \xi + \tfrac12 \bar{A}\hat{x}\xi]\bigr).$$
The scalar component on the right is zero, and the vector component is
$$(x_0^2 I + x_0 \bar{A}\hat{x} - \tfrac12 d\, xx'A + \tfrac14 \bar{A}\hat{x}\bar{A}\hat{x})\,\xi,$$
since $(\xi'Ax)x = xx'A\xi$. On the other hand, after transposition of the term
$u^2 v I$, the right member of (4.3) becomes


$$x_0^2 I + x_0 \bar{A}\hat{x} - d\,xx'A + \tfrac12 d(x'Ax)I.$$
To prove (4.11) equivalent to (4.2), (4.3) we have therefore to show that
$$\tfrac12 d\,xx'A + \tfrac14 \bar{A}\hat{x}\bar{A}\hat{x} = \tfrac12 d(x'Ax)I.$$
We do so by multiplying (4.8) on the left by $\bar{A}$.


5. The denominator of the automorph S. It is clear from (4.3) that
the denominator of $S$ is a divisor of $u^2 v$; we investigate whether any factor of
$u^2 v$ can cancel out.










LEMMA 2. Suppose $(p, d) = (x_0, x_1, x_2, x_3) = 1$, and let $p^\alpha$ be the highest
power of $p$ dividing $u^2 v$ and all the elements of the matrix $u^2 v S$; then $p^\alpha = 1$ or 4
and divides also all elements of $u^2 v S^{-1}$. Suppose $r$ is the greatest integer prime to $d$
that divides $u^2 v$; then the denominators of $S$ and $S^{-1}$ have the same factor prime
to $d$, which is either $r$ or $\tfrac14 r$, provided $(r, x_0, x_1, x_2, x_3) = 1$.


Proof. It is easily verified that the trace of $\bar{A}\hat{x}$ is identically zero; that of
$xx'A = (x_i\,\partial f/\partial x_j)$ is $2f(x)$. We have therefore, from (4.3),


(5.1) $u^2 v \operatorname{tr}(I + S) = 6x_0^2 - 2d f(x) = 2u^2 v + 4x_0^2$.


If we multiply (4.3) on the right by $\bar{A}$, we see that, with the present hypothesis
regarding $p^\alpha$, we have


$$2x_0^2 \bar{A} + x_0 \bar{A}\hat{x}\bar{A} + 2d^2 xx' \equiv 0 \pmod{p^\alpha}.$$
The second matrix on the left is skew, the other two symmetrical, so we deduce
(by adding the transposed congruence)


$$4x_0^2 \bar{A} + 4d^2 xx' \equiv 0 \pmod{p^\alpha}.$$
With (5.1) this gives that $p^\alpha$ divides $4x_0^2$ and $4d^2 xx'$, hence $4xx'$. Now the
hypothesis $(x_0, x_1, x_2, x_3) = 1$ gives $p^\alpha = 1$, 2 or 4.
To complete the proof of the first assertion, it only remains to show that if
$2 \mid u^2 v$ and $u^2 v S$ then 4 divides $u^2 v$ and both of $u^2 v S^{\pm 1}$. This is easily proved on
using (2.3). The second assertion is an obvious corollary of the first. We next
prove


LEMMA 3. If the denominator of $S = S(0, x)$ is prime to $p$, then $v = v(S) \in
\Gamma(p, f)$; and the two definitions of this group given in §3 are equivalent.


Proof. Putting $x_0 = 0$ in (4.2) and (4.3), and supposing without loss of
generality that $p \nmid x$, we see that for some $\delta \ge 0$ we must have


(5.2) $p^\delta \| f(x) = -d^{-1} u^2 v$, $p^\delta \mid xx'A$, $p^\delta \mid x'A$.
Hence (3.1) and (3.3) are satisfied with $t = x$, $b = f(x)$, and $v \in \Gamma(p, f)$
follows.


We see from (3.1), (3.3) that $\Gamma(p, f)$ is generated by the group of $v$ with
$(v|p) = 1$, or $v \equiv 1 \pmod 8$, together with those given by


(5.3) $v \equiv p^\epsilon v' \pmod{p^{\epsilon + 2 + (-1)^p}}$, $\epsilon = 0$ or 1, $p \nmid v'$,


for $\epsilon, v'$ such that (5.2) can be satisfied with $v = p^\epsilon v'$ and with $p \nmid x$; but note
that if $p \mid x$ we may put $\delta - 2$, $p^{-1}x$ for $\delta, x$. We may, without loss, replace
(5.2;) by a congruence mod p* with r = 6 + 2 +(—1)? and take u = 2p’ 
where @ is an integer dependent on 6 and d so chosen that v has the form of 
(5.3). 

We first dispose of the case when p = 2 and f is not diagonalizable. Using 
(2.2) (mod 2"), the condition on ¢, v’, obtained from (5.2) as explained above, 
becomes, since 





d= — Qe: gq d’ odd, 
(5.4) 2° vf = d’{2" (x1, x2) + 2ax5}, 2°|(2"x1, 2x2, 2***xs), 


€ = \2. + 6 (mod 2). 
There are two types of solution of (5.4): 
(i) If 
2°** 4 2" 6(x:, x2) 
we must have 
2°||2" (x1, x2), 8 = Au, ¢ odd. 
In particular, with x3; = 0 we have solutions of this type with « = A; + A. = A 
(mod 2) and vo’ = d’¢(x;, x2). Since d’¢ has an odd discriminant, we can have 
v’ congruent to any odd integer mod 8. 
(ii) If 
2°*"/2" (x1, x2), 
then 
2° ||2*x3, 6 = d2 (mod 2). 
It easily follows that $\Gamma(2, f) = \Gamma$ or $\Gamma_2$ according as $\lambda$ is odd or even.
Now with p > 2 assume (2.1), and the condition on «, v’ becomes 


3 
(5.5) p’d’ = a;020;), a; px, p’ |2p™ x, 
t= 
(= 1,2,3),«€= Ai + As +A; +6 (mod 2) 
The case $p$ odd is straightforward. Let $p^{\rho_i} \| x_i$. From (5.5)


$$\lambda_i + \rho_i \ge \delta \ge \min_i(\lambda_i + 2\rho_i).$$
Hence if $i$ is the index for which $\lambda_i + 2\rho_i$ is a minimum, we have $\rho_i = 0$
and $\delta = \lambda_i$. Now if $\lambda_j + 2\rho_j = \lambda_i$ it follows that $\rho_j = 0$. Hence we have two
possibilities:
(1) No two $\lambda_i$ are equal and $\delta$ is equal to one of them.


(2) Two $\lambda_i$ are equal.


Let $i, j, k$ be 1, 2, 3 in some order and in the first case $\delta = \lambda_k$, $\epsilon \equiv \lambda_i + \lambda_j$
(mod 2) and $(v'|p) = (a_1 a_2 a_3 a_k|p) = (a_i a_j|p)$, which is condition (ii) of the
alternative definition of $\Gamma(p, f)$. In the second case, condition (iii) is easily
verified.

For p = 2, trivial solutions of (5.5) are obtained as for odd p. To obtain 
any others differing from these either in the value of «¢ or in the residue of v’ 
mod 8 we must for some i, 7 have 


P8 ee PAT” 2 Hs. 










A little calculation shows that, with (5.5), this requires 
Ac f©8 CAG H+ 2, A, -—- 208 GA, +4, 


and so is impossible when the differences of the $\lambda_i$ are too large. When they
are not, the calculations are straightforward and we leave the rest to the
reader.

We next show that the first assertion of Lemma 3 is true for the automorph
$S(x_0, x)$ without the restriction $x_0 = 0$. This can be proved by straightforward
calculations similar to those of Lemma 3; but these are complicated for
$p = 2$. We give an alternative proof, based on the fact that every $S$ is a product
of $S$'s with $x_0 = 0$.


LEMMA 4. If the denominator of $S = S(x_0, x)$ is prime to $p$, then
$v = v(S) \in \Gamma(p, f)$.


Proof. We begin by making some preliminary simplifications in the case
$p = 2$. First, we assume that the exponents $\lambda_i$ in (2.1) are not all equal, and
that in (2.2) $\lambda \ne 1$; for in these two cases, which transform into each other,
Lemma 3 gives us $\Gamma(2, f) = \Gamma$, and we have nothing to prove. Next, we note
that in all other cases we either have


(5.6) $A \equiv \begin{pmatrix} 2 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \pmod 4$,


if $f$ satisfies (2.1) or (2.2), or the reciprocal form has this property. To show
that we need only prove the Lemma for one of two reciprocal forms, put
$T = \bar{A}$ in (4.6); after a little reduction this gives $v(-2d\bar{A}, S'^{-1}) = v(A, S)$.
Here $S'^{-1}$ is an automorph of $\bar{A}$ and $-2d\bar{A}$ is a multiple of the coefficient matrix
of the reciprocal form of $f$.

Now by (5.6), if $p = 2$, or (2.1), if $p > 2$, we may remove from $f(\xi)$ the
terms in $\xi_1\xi_2$, $\xi_1\xi_3$. We do this by an obvious transformation with denominator
prime to $p$, which affects neither the hypothesis nor the conclusion of this
Lemma, nor the assumption (5.6); see (4.6). Thus we may assume (5.6) and


(5.7) $f(\xi) = a\xi_1^2 + g(\xi_2, \xi_3)$, $p \nmid a$,


where the binary form $g$ has a divisor 2 if $p = 2$, by (5.6), and we may suppose
that the coefficient of $\xi_2\xi_3$ is divisible by at least as high a power of $p$ as that
of $\xi_2^2$. Writing for convenience


(5.8) $y = (1, 0, 0)'$, $z = (0, 1, 0)'$,

we must have that $t = z$ satisfies (3.1) and

(5.9) $z'Ay = 0$.

For convenience write $U(t) = S(0, t)$, and note that $U(t)t = t$, $U(t)\xi = -\xi$
if $t'A\xi = 0$, as is clear from (4.2), (4.3) with $x_0, x = 0, t$.
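
The two properties of $U(t)$ just noted are easy to confirm on an example. The sketch below assumes the explicit form $U(t) = -I + t\,t'A/f(t)$, which is what (4.2), (4.3) give on our reading when $x_0 = 0$; the helper names `f_value`, `U` and `apply` are ours.

```python
from fractions import Fraction


def f_value(A, x):
    """f(x) = x'Ax / 2."""
    return Fraction(sum(x[i] * A[i][j] * x[j] for i in range(3) for j in range(3)), 2)


def U(A, t):
    """Matrix of the automorph U(t) = S(0, t) = -I + t t'A / f(t)."""
    ft = f_value(A, t)
    if ft == 0:
        raise ValueError("f(t) must not vanish")
    tA = [sum(t[i] * A[i][j] for i in range(3)) for j in range(3)]  # row vector t'A
    return [[Fraction(t[i] * tA[j]) / ft - (i == j) for j in range(3)]
            for i in range(3)]


def apply(M, v):
    """Matrix times column vector."""
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]
```

With $f = \xi_1^2 + \xi_2^2 - \xi_3^2$ and $t = (1, 2, 1)'$, the vector $\xi = (2, -1, 0)'$ satisfies $t'A\xi = 0$, so $U(t)$ fixes $t$ and sends $\xi$ to $-\xi$, while preserving the value of $f$ on every vector.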
















First suppose that $Sy = -y$. Then Lemma 1 (iii) shows that $x_0 = 0$,
whence $v(S) \in \Gamma(p, f)$ follows from Lemma 3.

Next suppose $Sy = y$. Then from (5.9) we have $U(z)Sy = -y$, and by the
case just considered

$$v(U(z)S) = v(U(z)) \circ v(S) \in \Gamma(p, f).$$
But as $t = z$ satisfies (3.1), $U(z)$ has denominator prime to $p$, so by Lemma 3
$v(U(z))$ is in $\Gamma(p, f)$ and so must be $v(S)$.
Now note that $U(y + Sy)$ takes $y - Sy$ into $Sy - y$, because
$$(y' + y'S')A(y - Sy) = y'S'Ay - y'ASy = 0,$$
and leaves $y + Sy$ invariant. Hence this transformation takes $Sy$ into $y$.
Similarly, $U(y - Sy)$ takes $Sy$ into $-y$.

Our conclusion will thus follow from the special cases already considered
if we can show that at least one of $y \pm Sy$ is a solution of (3.1). We have,
however,

$$f(y \pm Sy) = \tfrac12 (y' \pm y'S')A(y \pm Sy) = 2f(y) \pm y'ASy = 2a \pm y'ASy,$$
and by proper choice of the sign $f(y \pm Sy)$ is not divisible by $p$, if $p > 2$, or
8, if $p = 2$.

This completes the proof for $p > 2$. For $p = 2$ we need to prove

$$4 \mid (y' \pm y'S')A,$$
i.e., if $y'S' = (\eta_1, \eta_2, \eta_3)$, $2 \mid (1 \pm \eta_1)$. This follows from (5.6) and $f(y) = f(Sy)$.


6. The groups and the automorphs. We construct automorphs with 
certain desired properties, making use of the assumption that f is indefinite. 


LEMMA 5. If $f$ has an automorph $S$ with denominator $q$ in $\Gamma_d$, then $q \in \gamma(f)$.
Conversely, suppose $q \in \gamma(f)$, whence by the definition of $\gamma(f)$ there exists a $w$
with


$$w \mid d, \qquad wq \in \bigcap_{p \mid d} \Gamma(p, f).$$
Then for every such $w$ there exists an automorph $S$ of $f$ with $v(S) = wq$.


Proof. With $v(S) = wq$, $w = \pm(v, d)$, $(q, d) = 1$, the denominator of $S$
must, if it is in $\Gamma_d$, be $q$ or $-q$. For by Lemma 2 it must be $q u_1^2$ or $\tfrac14 q u_1^2$, $u_1$
some factor of $u$, and so it is square-free only if, with $u_1 = 1$ or 2, it is equal
to $q$. Then the hypothesis of the first part of the Lemma gives, with Lemma 4,


$$v = wq \in \bigcap_{p \mid d} \Gamma(p, f),$$


whence $q \in \gamma(f)$. It turns out that the two cases $u_1 = 1$, 2 correspond to the same
set of possible values of $v$, which simplifies our proof of the second assertion.

If now $v = wq \in \Gamma(p, f)$, suppose first that $v$ is in the set of generators of
$\Gamma(p, f)$ defined in §3. Take any solution of (3.1) with $b$ satisfying (3.3) and
construct the automorph $S(0, t) = S_1$, say. Plainly $v(S_1) = v_1$ is such that
$v \circ v_1$ is a quadratic residue modulo $p$ or 8, if $p = 2$, and $S_1$ must have denominator
prime to $p$, though possibly not square-free. We thus have a solution
with $t_0 = 0$ of


(6.1) $t_0^2 - d f(t) \equiv u^2 v \pmod{p^\beta}$,
(6.2) $2t_0^2 I + t_0 \bar{A}\hat{t} - d\,tt'A \equiv 0 \pmod{p^\alpha}$,
for $\alpha$, $\beta$ with $p^\alpha \| u^2 v$, $\beta = \alpha + 2 + (-1)^p$.


By multiplying together two or more such automorphs $S_1$ we can construct
a solution of (6.1), (6.2) when $v$, though in $\Gamma(p, f)$, is not one of the generators.
From the solution thus found, we can obviously construct another in which $u$
has no factor prime to $p$.
Next, if


$$v \in \bigcap_{p \mid d} \Gamma(p, f),$$


we can find $t_0$, $t$, $u$ so that all the pairs of congruences (6.1), (6.2), for $p$ ranging
over the prime divisors of $d$, hold simultaneously; and so that $u$ is a product of
powers of primes all dividing $d$. Suppose now that we can solve


(6.3) $x_0^2 - d f(x) = u^2 v$; $x_0, x \equiv t_0, t \pmod{p^\beta}$,


for each $p \mid d$. Then from (4.2), (4.3) it is clear that $S(x_0, x)$ has denominator
prime to $d$, while the number $r$ of Lemma 2 is $q$, so the denominator cannot
be $\tfrac14 r = \tfrac14 q$ but must be $q$. Thus $S(x_0, x)$ will give us all we require.

We have therefore only to prove the solubility of (6.3). We note first the
obviously necessary congruence condition, namely the solubility of

(6.4) $x_0^2 - d f(x) \equiv u^2 v \pmod{p'^{\beta'}}$,

for every prime power $p'^{\beta'}$, subject to the restriction, vacuous if $p' \nmid d$, that
the solution must satisfy $(6.3)_2$. The solubility of (6.4) for $p' \nmid d$ is obvious
from (2.3); for, using (2.3), (6.4) becomes

    $x_0^2 - d x_1 x_2 - d^2 x_3^2 \equiv u^2 v$.

So we take $p' = p$, $p \mid d$, and we may suppose $\beta' = \beta$ by elementary properties
of quadratic residues. Now the desired solution of (6.4), $(6.3)_2$ is $x_0, x = t_0, t$.
Hence the necessary condition is satisfied, and the proof is completed by
remarking that it is also sufficient. For $x_0^2 - d f(x)$ is a non-degenerate,
indefinite form in more than three variables, and so a recent result of one of us
(13) gives what is required.


7. Forms in the genus of f. Let the forms $f$, $f_1$ be in the same genus. This
means, by the classical definition, that $f$ goes into $f_1$ by a rational unimodular
transformation with denominator prime to $2d$. Then by the classical theory we
know that the transformation may be chosen so as to have its denominator
prime to any prescribed positive integer. We can therefore find $R_1$, $R_2$, with
denominators $r_1$, $r_2$, such that

(7.1) $f_1(\xi) = f(R_1 \xi) = f(R_2 \xi)$, $\quad (r_1, d) = (r_2, d r_1) = 1$.






















The following lemmas will tell us more about the possible values of the 
denominator. 


LEMMA 6. Suppose that $q$ is square-free and prime to $d$ and that $q r_2$ is a
quadratic residue modulo every odd prime factor of $d r_1$, and also modulo 8 if $d r_1$
is even. Then $Q$ may be found so that $f_1(\xi) = f(Q\xi)$ and $Q$ has denominator $q$.


Proof. By (7.1), $R_2 R_1^{-1}$ and its reciprocal $R_1 R_2^{-1}$ are automorphs of $f$
whose denominators, dividing $r_1^2 r_2$, $r_2^2 r_1$, are prime to $d$ and hence equal,
by the last part of Lemma 2. Hence it is easily seen that each denominator
must be equal to $r_1 r_2$, whence by Lemmas 1 and 2 we must have


(7.2) R:R;i' = S(yo, ¥), yo — df(y) = Usuiwr irs, 


with $w \mid d$, $u_1 = 1$ or $2$, $(u_1, d) = 1$ and $u_2$ having no prime factor that does not
divide $d$.
The conditions on $q$ ensure that we can solve for $\theta$,


(7.3) uir’’ = q (mod d‘uswr'), (0,dr;) = 1. 
We now seek a solution of 

(7.4) x — df(x) = ujwrig, 

subject to 

(7.5) Xo = Oyo, x = Oy (mod ujwr'). 


By the result used in the proof of Lemma 5, (7.4) and (7.5) are soluble if,
for every prime $p$, with $\alpha = \alpha(p)$ such that $p^\alpha$ exactly divides the right-hand
side of (7.4), the equation (7.4), treated as a congruence modulo $p^\beta$, has a
solution consistent with (7.5) for any prescribed $\beta$.
Now if $p \nmid d r_1$, (7.5) is vacuous and the solubility of (7.4) modulo $p^\beta$ is trivial.
On the other hand, if $p \mid d r_1$, (7.2) and (7.3) show that $x_0 = \theta y_0$, $x = \theta y$ is
such a solution for a value of $\beta$ certainly not less than $\alpha + 3$. We need not,
by elementary properties of quadratic residues, consider any greater value of
$\beta$; so (7.4) and (7.5) are simultaneously soluble.

We now show that $Q = S(x_0, x) R_1$, which clearly takes $f$ into $f_1$, has
denominator $q$. By (4.3), with $u^2 v$ equal to the right-hand side of (7.4), and the
corresponding equation for $S(y_0, y)$, and (7.3), we have the following congruences,
in which the matrices occurring on both sides of the congruences are integral:


reugwrigS (Xo, x)= rouswr.gR2R; (mod uiwr,), 
    $q r_1 r_2 Q \equiv q r_1 r_2 R_2 \equiv 0 \pmod{r_1}$.


Hence $r_2 q Q$ is integral, and so is $qQ$, since it has denominator prime to $r_2$.
On the other hand, since $R_1$ is unimodular with denominator prime to $q$,
if $Q$ had as denominator a proper divisor of $q$, so would $S(x_0, x)$, contradicting
Lemma 2. This completes the proof.


DEFINITION. $\gamma(f_1, f)$ is the set of all $q$ in $\Gamma_d$ that are denominators of
matrices taking $f$ into $f_1$, $f_1$ being any form in the genus of $f$.










Lemma 6 shows that the set $\gamma(f_1, f)$ is not empty; we prove

LEMMA 7. $\gamma(f_1, f)$ is a coset of $\gamma(f)$ in $\Gamma_d$.


Proof. We first show that if $q_1$ and $q_2$ are in $\gamma(f_1, f)$ then $q_1 \cdot q_2$ is in $\gamma(f)$.
In case $(q_1, q_2) = 1$, this is clear; for with an obvious notation we see that the
automorph $Q_2 Q_1^{-1}$ of $f$ must, as in the proof of Lemma 6, have denominator
$q_1 q_2 = q_1 \cdot q_2 \in \gamma(f)$.

If $(q_1, q_2) > 1$, we may argue as above with a suitably chosen $q_3$ prime to
$q_2$ in place of $q_1$. We have only to make $q_3$ satisfy the sufficient condition of
Lemma 6, with any $r_1$ prime to $q_1 q_2$, and with $q_1$ for $r_2$. This sufficient condition
obviously ensures that $q_1$ and $q_3$ belong to the same coset of $\gamma(f)$ in $\Gamma_d$; and
the conclusion of Lemma 6 gives $q_3 \in \gamma(f_1, f)$.

Conversely, we show that, with $q_1$, $\gamma(f_1, f)$ contains all $q_2$ in the coset to
which $q_1$ belongs. This follows for $q_2 = q_0 q_1$, $(q_0, q_1) = 1$, $q_0 \in \gamma(f)$, if we
replace $Q_1$ by $S Q_1$, $S$ being an automorph of $f$ with denominator $q_0$. When $q_2$
does not satisfy these conditions, we consider, as in the first part of the proof,
a suitable $q_3$ that does. The conclusion follows.
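
The coset arithmetic used in this proof can be illustrated with a small sketch. It assumes, as the notation $q_1 \cdot q_2$ (distinct from the ordinary product $q_1 q_2$) suggests, that the product in $\Gamma_d$ of two square-free integers discards the square of their common factor. That reading is an inference, and the names `gamma_mul` and `coset`, as well as the toy subgroup, are hypothetical.

```python
from math import gcd

def gamma_mul(q1, q2):
    """Product of two square-free integers with the square of their gcd removed.

    Assumption: this is the operation written q1 . q2 in the text.
    """
    g = gcd(q1, q2)
    return (q1 * q2) // (g * g)

def coset(q, subgroup):
    """The coset q . H of a finite set H of square-free integers."""
    return {gamma_mul(q, h) for h in subgroup}

# Toy subgroup, chosen only for illustration: closed under gamma_mul,
# and every element is its own inverse.
H = {1, 6, 10, 15}
C = coset(7, H)

# Any two members of one coset multiply into the subgroup,
# which is the pattern "q1, q2 in gamma(f1, f) implies q1 . q2 in gamma(f)".
products_stay_in_H = all(gamma_mul(a, b) in H for a in C for b in C)
```

In this toy case `C` is `{7, 42, 70, 105}` and `products_stay_in_H` is true.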

We note some properties of the cosets $\gamma(f_1, f)$.


LEMMA 8. $\gamma(f_1, f)$ depends only on the classes of $f_1$, $f$; $\gamma(f, f) = \gamma(f)$,
$\gamma(f, f_1) = \gamma(f_1, f)$ and $\gamma(f_1, f_2) = \gamma(f_1, f) \cdot \gamma(f_2, f)$.


Proof. The first two assertions are obvious. The first part of the proof of
Lemma 6 shows that $R_1$ and $R_1^{-1}$ must have the same denominator $r_1$; hence
the third assertion, taking $r_1$ to be in $\gamma(f_1, f)$. To prove the last assertion,
consider coprime representatives of the cosets $\gamma(f_1, f)$, $\gamma(f_2, f)$ and multiply
the corresponding $Q$.


8. Proof of Theorem 1; a set of forms representing the genus. Theorem 1
follows from Lemmas 7, 8 if we show that every $q$ in $\Gamma_d$ is the denominator
of a matrix taking $f$ into a form in its genus. We do this by constructing
certain forms which we shall also use for the proof of Theorem 2.

We may suppose, see (2.3), that, for any prescribed $q$ in $\Gamma_d$,

(8.1) $f(\xi) \equiv \xi_1 \xi_2 + d \xi_3^2 \pmod{q^2}$.


Putting $x_0 = 0$, $u^2 v = -dq$, and $R$ for $S$ in (4.2), (4.3), we define an automorph
$R = R(x)$ of the form on the right of (8.1), for any $x$ for which

(8.2) $x_1 x_2 + d x_3^2 = q$

holds, by

(8.3) $q(I + R) = x x' \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 2d \end{pmatrix}$.

By Lemma 2, $R$ has denominator $q$, and we note that $R^2 = I$. Now we define
$\psi = \psi(f, R)$ by

(8.4) $\psi(\xi) = f(R\xi) \equiv f(\xi) \pmod{q}$.













When $q$ is odd, $\psi$ is in the genus of $f$, by the classical definition. This is also
true for even $q$; to prove it we use the automorph $S(4, x)$ of $f$, and we verify
from (4.2), (4.3), (8.1), (8.2), (8.3) that $S(4, x) R$ has an odd denominator
prime to $d$.


For integral $\xi$, $R\xi$ is integral if

(8.5) $x' \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 2d \end{pmatrix} \xi = x_1 \xi_2 + x_2 \xi_1 + 2d x_3 \xi_3 \equiv 0 \pmod{q}$.

By Lemmas 7, 8, the class of $\psi$ depends only on the coset of $\gamma(f)$ in $\Gamma_d$ to which $q$
belongs; so we are free to choose $x$ so that any $\xi$ in which we are interested
satisfies (8.5).


LEMMA 9. Given any $q$ in $\Gamma_d$ and any $\theta_1, \theta_2 \neq 0, 0$, there exists an $x$ satisfying
(8.2) such that, for $R = R(x)$ defined by (8.3), $R\xi$ is integral when

(8.6) $\xi_2 \theta_1^2 + 2d \xi_3 \theta_1 \theta_2 - d \xi_1 \theta_2^2 \equiv 0 \pmod{q}$.


Proof. Without loss of generality we may suppose $\theta_1 = 1$. Then (8.6)
becomes (8.5) and (8.2) holds, if $x_1, x_2, x_3 = 1, q - d\theta_2^2, \theta_2$.

We note that, regarding $\xi$ as given and (8.6) as a quadratic congruence for
the ratio $\theta_1 : \theta_2$, the discriminant of the congruence is $4d(\xi_1 \xi_2 + d \xi_3^2)$, which by
(8.1) is congruent to $4d f(\xi)$ modulo $q$.
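
The construction of $R$ in (8.2), (8.3) and the choice of $x$ in the proof of Lemma 9 can be checked numerically. The sketch below works with the model form $\xi_1\xi_2 + d\xi_3^2$ on the right of (8.1) and uses exact rational arithmetic; the helper names (`quad`, `lemma9_x`, `reflection`, `apply`) and the sample values $d = 3$, $q = 7$, $\theta_2 = 2$ are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction

def quad(v, d):
    """Value of the model form xi1*xi2 + d*xi3^2 from the right of (8.1)."""
    return v[0] * v[1] + d * v[2] ** 2

def lemma9_x(q, d, theta2):
    """The vector x = (1, q - d*theta2^2, theta2) from the proof of Lemma 9."""
    return (1, q - d * theta2 ** 2, theta2)

def reflection(x, d):
    """R defined by q(I + R) = x x' A, A = [[0,1,0],[1,0,0],[0,0,2d]], q = quad(x)."""
    q = quad(x, d)
    xA = (x[1], x[0], 2 * d * x[2])  # the row vector x'A
    return [[Fraction(x[i] * xA[j], q) - (1 if i == j else 0)
             for j in range(3)] for i in range(3)]

def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

d, q, theta2 = 3, 7, 2          # sample values: q square-free and prime to d
x = lemma9_x(q, d, theta2)      # (1, -5, 2), and quad(x, d) == q as (8.2) requires
R = reflection(x, d)

xi = (1, 0, 1)                  # satisfies (8.6) with theta1 = 1: 0 + 12 - 12 = 0
image = apply(R, xi)            # integral, as Lemma 9 asserts

bad = apply(R, (1, 0, 0))       # (8.6) fails here: -12 is not 0 mod 7
```

For these values `image` is the integral vector $(0, -5, 1)$ and the model form takes the value 3 on both `xi` and `image`, while `bad` has entries with denominator 7.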


9. Representation of integers. We study in this section the representation
of an integer $n \neq 0$ by forms in the genus of $f$; and as in the statement of
Theorem 2 we write $dn = n_1 n_2^2$, $n_1$ square-free. We prove three lemmas and
deduce Theorem 2.
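
As a concrete aid, the decomposition $dn = n_1 n_2^2$ with $n_1$ square-free can be computed by trial division. This is a minimal sketch; the function name `squarefree_decomposition` is hypothetical, and the sign is carried by $n_1$ so that $n_2$ is positive.

```python
def squarefree_decomposition(m):
    """Return (n1, n2) with m == n1 * n2**2, n1 square-free and n2 > 0."""
    if m == 0:
        raise ValueError("m must be non-zero")
    sign = -1 if m < 0 else 1
    m = abs(m)
    n1, n2 = 1, 1
    p = 2
    while p * p <= m:
        e = 0
        while m % p == 0:        # strip the full power of p
            m //= p
            e += 1
        n2 *= p ** (e // 2)      # even part of the exponent goes into n2
        if e % 2:
            n1 *= p              # odd leftover goes into n1
        p += 1
    n1 *= m                      # any remaining factor is a single prime (or 1)
    return sign * n1, n2

d, n = 3, 24
n1, n2 = squarefree_decomposition(d * n)   # 72 = 2 * 6**2
```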


LEMMA 10. Suppose $f_1$ is in the genus of $f$, $q \in \gamma(f_1, f)$, and $f$ represents $n$.
Then a sufficient condition for $f_1$ to represent $n$ is that $(dn|p)$, the Legendre
symbol, is 0 or 1 for each odd prime $p$ dividing $q$.


Proof. For any $\xi$ with $f(\xi) = n$ we can solve (8.6) by the hypothesis of this
Lemma (see the remark following Lemma 9). With the $x$, $R(x)$ whose existence
is asserted by Lemma 9, consider the form $\psi = \psi(f, R)$ defined by (8.4).
We have

(9.1) $n = f(\xi) = \psi(R^{-1}\xi) = \psi(R\xi)$,

and $R\xi$ is integral; so $\psi$ represents $n$. But since $q \in \gamma(f_1, f)$ and $f$ goes into $\psi$
by $R$ with denominator $q$, we see from the definition of $\gamma(f_1, f)$ and Lemmas
7, 8 that $\psi$ is equivalent to $f_1$; hence $f_1$ represents $n$.
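
The sufficient condition of Lemma 10 is easy to test for given $d$, $n$, $q$. The following sketch evaluates the Legendre symbol by Euler's criterion and checks that $(dn|p)$ is 0 or 1 for every odd prime $p$ dividing $q$. It does not verify the other hypotheses of the lemma (that $q \in \gamma(f_1, f)$ and that $f$ represents $n$), and the function names are hypothetical.

```python
def legendre(a, p):
    """Legendre symbol (a|p) for an odd prime p, by Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def odd_prime_divisors(q):
    """Odd primes dividing the non-zero integer q, by trial division."""
    if q == 0:
        raise ValueError("q must be non-zero")
    q = abs(q)
    while q % 2 == 0:
        q //= 2
    primes, p = [], 3
    while p * p <= q:
        if q % p == 0:
            primes.append(p)
            while q % p == 0:
                q //= p
        p += 2
    if q > 1:
        primes.append(q)
    return primes

def lemma10_condition(d, n, q):
    """True when (dn|p) is 0 or 1 for each odd prime p dividing q."""
    return all(legendre(d * n, p) >= 0 for p in odd_prime_divisors(q))
```

For example, with $d = 3$ and $n = 1$ the condition fails for $q = 7$, since 3 is a non-residue modulo 7, and holds for $q = 11$ and $q = 22$.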


LEMMA 11. If $f$ represents $n$, then so does $f_1$ in the genus of $f$ if there exist
$q_0$, $q_1$ satisfying

(9.2) $q_0 \mid 2n_2$, $\quad q_0 \in \Gamma_d$, $\quad q_1 \in \gamma(f_1, f)$, $\quad (q_0 q_1, n_1) = 1$,
(9.3) $\prod (n_1, q_0 q_1)_p = 1$,

where the symbol in (9.3) is the Hilbert Symbol (6, p. 27) and the product is
over all primes $p$ dividing $n_1$ if $d$ is odd or $n_1 \not\equiv -1 \pmod{4}$; over all primes
dividing $2n_1$ if $d$ is even and $n_1 \equiv -1 \pmod{4}$.


Proof. By Dirichlet’s theorem we may choose a positive prime 

(9.4) D: = gogi (mod d’ni), p: 1 2m. 

Then, taking $q = q_0 p_1$ in Lemma 10, we need only verify $(dn, p_1)_{p_1} = (n_1|p_1) = 1$.
Now (9.3) and (9.4) imply, if $d$ is odd or $n_1 \not\equiv -1 \pmod{4}$,

    $1 = \prod_{p \mid n_1} (n_1, q_0 q_1)_p = \prod_{p \mid n_1} (n_1, p_1)_p$.

By a fundamental property of the Hilbert symbol the last product is equal
to $(n_1, p_1)_{p_1} (n_1, p_1)_\infty$ multiplied by $(n_1, p_1)_2$ if $n_1$ is odd; that is, $(n_1|p_1)$ or
$(n_1|p_1)(n_1, p_1)_2$ according as $n_1$ is even or odd. But $(n_1, p_1)_2 = 1$ if $n_1 \equiv 1 \pmod{4}$,
while if $n_1 \equiv -1 \pmod{4}$ and $d$ is odd we may choose $p_1 \equiv 1 \pmod{4}$
consistent with (9.4). The case $d$ even and $n_1 \equiv -1 \pmod{4}$ is similar but
simpler.
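
The product in (9.3) can be evaluated directly. The sketch below uses the standard explicit formulas for the Hilbert symbol $(a, b)_p$ at a finite prime, which are assumed here to agree with the convention of (6, p. 27), and takes the product over the range stated in Lemma 11. The names `hilbert` and `product_9_3` are hypothetical, and the argument `c` stands for $q_0 q_1$.

```python
def legendre(a, p):
    """Legendre symbol (a|p) for an odd prime p, by Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def split(a, p):
    """Write a = p**e * u with p not dividing u; return (e, u)."""
    e = 0
    while a % p == 0:
        a //= p
        e += 1
    return e, a

def hilbert(a, b, p):
    """Hilbert symbol (a, b)_p for non-zero integers a, b and a prime p."""
    if a == 0 or b == 0:
        raise ValueError("a and b must be non-zero")
    alpha, u = split(a, p)
    beta, v = split(b, p)
    if p == 2:
        eps = lambda t: ((t - 1) // 2) % 2        # 0 if t = 1 mod 4, else 1
        omega = lambda t: ((t * t - 1) // 8) % 2  # 0 if t = +-1 mod 8, else 1
        exponent = eps(u) * eps(v) + alpha * omega(v) + beta * omega(u)
        return -1 if exponent % 2 else 1
    sign = -1 if (alpha * beta * ((p - 1) // 2)) % 2 else 1
    if beta % 2:
        sign *= legendre(u, p)
    if alpha % 2:
        sign *= legendre(v, p)
    return sign

def prime_divisors(m):
    """Primes dividing the non-zero integer m, by trial division."""
    m, primes, p = abs(m), [], 2
    while p * p <= m:
        if m % p == 0:
            primes.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        primes.append(m)
    return primes

def product_9_3(n1, c, d):
    """Product of (n1, c)_p over p | n1, or over p | 2*n1 when d is even
    and n1 = -1 mod 4, as in the statement of Lemma 11."""
    m = 2 * n1 if (d % 2 == 0 and n1 % 4 == 3) else n1
    result = 1
    for p in prime_divisors(m):
        result *= hilbert(n1, c, p)
    return result
```

For instance, `product_9_3(5, 11, 3)` is $+1$ because 11 is a quadratic residue modulo 5, whereas `product_9_3(5, 2, 3)` is $-1$.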


Putting $q_0 = q_1 = 2$, we see that $f$ and $f_1$ represent the same integers if
$2 \in \gamma(f_1, f)$. Notice that if $d$ is odd and $n_1 \equiv -1 \pmod{4}$ we can choose
$p_1 \equiv \pm 1 \pmod{4}$ consistent with (9.4) so that

    $1 = (n_1|p_1) = (n_1, p_1)_2 \prod (n_1, q_0 q_1)_p$

whatever the value of the product in (9.3).


LEMMA 12. Suppose $f$ represents $n$. Then all forms in the genus of $f$ represent
$n$ if (for suitable $p_1$, $p_2$) one of the following holds:

(9.5) $q \in \Gamma(p_2, f)$, $\quad (q, n_1 d) = 1$, $\quad p_2 \mid n_1$, $\quad (n_1, q)_{p_2} = -1$,
(9.6) $n_1 < 0$ or $n_1 \equiv -1 \pmod{4}$ with $d$ odd,
(9.7) $q \in \gamma(f)$, $\quad (q, n_1) = 1$, $\quad \prod (n_1, q)_p = -1$,
(9.8) $p_1 \mid 2n_2$, $\quad (p_1, n_1 d) = 1$, $\quad \prod (n_1, p_1)_p = -1$,

where the products are over the same range as in (9.3).


Proof. We first show that $f_1$ represents $n$ if (9.5) holds and

(9.9) $(q|p) = 1$ for each odd $p \neq p_2$ dividing $d n_1$, and
      $q \equiv 1 \pmod{8}$ if $p_2 \neq 2$.

We have from $(9.5)_1$, in case $p_2 \mid d$, and (9.9), that

    $q \in \bigcap_{p \mid d} \Gamma(p, f)$;

thus $q$ is in the subgroup of $\gamma(f)$ for which $w$ may be taken equal to 1 in the
definition of $\gamma(f)$. Thus if the product in (9.3) with $q_0 = 1$ is $-1$, then the
product $\prod (n_1, q_1 q)_p$ is $+1$ since

    $\prod_{p \mid 2n_1} (n_1, q)_p = \prod_{p \mid n_1} (n_1, q)_p = -1$.







But $q \in \gamma(f)$ implies $q_1 q \in \gamma(f_1, f)$ and hence (9.5) and (9.9) are sufficient.
But if (9.5) is soluble at all it must have a solution satisfying (9.9); for the two
formulae, for given $p_2$, are congruences for $q$ to coprime moduli. Hence (9.5)
alone is sufficient.

Next $(9.6)_2$ follows from the remark after the proof of Lemma 11. Suppose
$n_1 < 0$. Then

    $\prod_{p \mid n_1} (n_1, -1)_p = (n_1, -1)_\infty = -1$

if $n_1$ is even and $-(n_1, -1)_2$ if $n_1$ is odd. Both are $-1$ unless $n_1 \equiv -1 \pmod{4}$.
We have just excluded $n_1 \equiv -1 \pmod{4}$ when $d$ is odd. If $d$ is even the product
in (9.3) with $q_0 q_1$ replaced by $-1$ is over primes dividing $2n_1$ and hence is
$-1$. Thus one of

    $\prod (n_1, q_0 q_1)_p$, $\quad \prod (n_1, -q_0 q_1)_p$

is $+1$; we may put $-q_0$ for $q_0$ in (9.2), (9.3).

If (9.7) holds and (9.3) is denied for $q_0 = 1$, then $q_1 q$ is an element of $\gamma(f_1, f)$
and (9.3) holds for $q_1$ replaced by $q q_1$.

If (9.8) holds, then (9.3) can be satisfied either with $q_0 = 1$ or $q_0 = p_1$.


Proof of Theorem 2. We write as in the foregoing lemmas $dn = n_1 n_2^2$,
$n_1$ square-free; and by the hypothesis of the theorem, some form $f$ in the genus
considered represents $n$, but there is at least one form in the genus that does
not. It follows that none of the sufficient conditions of Lemma 12 can be
satisfied, while that of Lemma 11 must fail for some $f_1$.

(i) Condition (9.6) gives us $n_1 > 0$. If $n_1 = 1$, (9.2) can be satisfied with
$q_0 = 1$ and some $q_1$, and then (9.3) necessarily holds. Hence $n_1 > 1$.

(ii) Suppose $p_2 \mid n_1$, with $p_2$ odd. Then $(9.5)_3$, $(9.5)_4$ hold if $(q|p_2) = -1$
and so $(9.5)_1$ must fail for such $q$ satisfying $(9.5)_2$. This means that $\Gamma(p_2, f)$
does not contain the group $\Gamma_{p_2}$ of $q$ prime to $p_2$. By the definition of $d_0$ this
means that $p_2 \mid d_0$.

Now take $p_2 = 2$ in (9.5). Then $(9.5)_1$ must fail for $q \equiv 5 \pmod{8}$, with
which $(9.5)_2$ to $(9.5)_4$ can be satisfied. Hence 5 is not in $\Gamma(2, f)$; this means,
by the definition of $d_0$, that $2 \mid d_0$. We have thus shown that $p_2 \mid n_1$ implies
$p_2 \mid d_0$ and hence $n_1 \mid d_0$.

(iii) For odd $d$, $n_1 \mid d_0 \mid d$ and hence we have $n_1 \equiv 1 \pmod{4}$ by $(9.6)_2$. If
$n_1 \equiv 5 \pmod{8}$, then (9.8) is satisfied by $p_1 = 2$. Hence $n_1 \equiv 1 \pmod{8}$ for
odd $d$.

(iv) By the hypotheses of this part of the theorem, $(9.8)_2$ and $(9.8)_3$ hold
with $p_1 = p$; for by (ii) $p$ prime to $d$ cannot divide $n_1$, while $\prod (n_1, p_1)_p$
reduces to $(n_1|p_1)$. Hence $(9.8)_1$ must fail and $p = p_1$ does not divide $n_2$.

(v) From (ii) we see that $(9.7)_2$ is implied by $(9.7)_1$; hence the failure of
(9.7) for any $f_1$ means that $\prod (n_1, q)_p = 1$ for all $q$ in $\gamma(f)$. This means that
$\prod (n_1, q)_p$ has a fixed value $\pm 1$, say $\chi(f_1, f)$, for all $q$ in $\gamma(f_1, f)$. Now Lemma 11,
with $q_0 = 1$, tells us that $f_1$ represents $n$ if $\chi(f_1, f) = 1$. We see from Lemma 8













that this condition holds for just half the classes in the genus. It may be,
however (since the condition of Lemma 11 is only sufficient), that $n$ is represented
by some form $f_2$ with $\chi(f_2, f) = -1$. If so, we have to show that,
contrary to hypothesis, all forms in the genus represent $n$. But Lemma 11,
with this assumption regarding $f_2$, shows that either $\chi(f_1, f) = 1$ or $\chi(f_1, f_2) = 1$
is sufficient for representation of $n$ by $f_1$. And from Lemma 8 it is clear that

    $\chi(f_1, f_2) = \chi(f_1, f)\,\chi(f_2, f) = -\chi(f_1, f)$.

The proof of the assertion (v) is thus complete.


REFERENCES

1. P. Bachmann, Die Arithmetik der quadratischen Formen (Leipzig and Berlin, 1925).
2. H. Brandt, Ueber Stammfaktoren bei ternären quadratischen Formen, Ber. Ver. Sächsischen
   Akad. Wiss. zu Leipzig, Math.-nat. Klasse, 100.1 (1952), 24 pp.
3. A. Cayley, A memoir on the automorphic linear transformation of a linear bipartite quadric
   function, Phil. Trans. Roy. Soc. London 148 (1858), 39-46.
4. Martin Eichler, Quadratische Formen und orthogonale Gruppen (Berlin, 1952).
5. Ch. Hermite, Sur la théorie des formes quadratiques ternaires indéfinies, Jour. für Math.
   47 (1854), 307-312.
6. B. W. Jones, The arithmetic theory of quadratic forms (New York, 1950).
7. B. W. Jones and Donald Marsh, Automorphs of quadratic forms, Duke Math. J. 21 (1954),
   179-193.
8. C. C. MacDuffee, The theory of matrices (Berlin, 1933).
9. Donald Marsh, An investigation of the number of classes in the genus of certain indefinite
   ternary quadratic forms, unpublished thesis, University of Colorado (1953).
10. A. Meyer, Ueber indefinite ternäre quadratische Formen, J. für Math. 113 (1894), 186-206;
    114 (1895), 233-254; 115 (1896), 150-182; 116 (1896), 307-325.
11. Gordon Pall, On generalized quaternions, Trans. Amer. Math. Soc. 59 (1946), 280-332.
    Also, Quaternions and Sums of Three Squares, Amer. J. Math. 64 (1942), 503-513.
12. T. J. Stieltjes, Un théorème d'algèbre, Acta Math. 6 (1885), 319-320.
13. G. L. Watson, Representation of integers by indefinite quadratic forms, Mathematika 2
    (1955), 32-38.


Queen Mary College, London, and University of Colorado
University College, London

























